row filtering for logical replication
Hi,
The attached patches add support for filtering rows in the publisher.
The output plugin does the filtering if a filter was defined in the
CREATE PUBLICATION command. An optional WHERE clause can be added after
the table name in CREATE PUBLICATION, such as:
CREATE PUBLICATION foo FOR TABLE departments WHERE (id > 2000 AND id <= 3000);
Rows that don't match the WHERE clause will not be sent to the subscribers.
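To illustrate (hypothetical table definition and data; only the second row satisfies the condition):

```sql
CREATE TABLE departments (id int PRIMARY KEY, name text);
CREATE PUBLICATION foo FOR TABLE departments WHERE (id > 2000 AND id <= 3000);

INSERT INTO departments VALUES (1500, 'sales');   -- filtered out: id <= 2000
INSERT INTO departments VALUES (2500, 'support'); -- replicated
INSERT INTO departments VALUES (3500, 'dev');     -- filtered out: id > 3000
```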
Patches 0001 and 0002 are refactoring only and can be applied
independently. 0003 does not yet implement row filtering during initial
table synchronization.
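Two behaviors in 0003 that reviewers may want to note, sketched below (the
publication and table names are just the ones from the docs examples):

```sql
-- WHERE is accepted for ADD/SET TABLE:
ALTER PUBLICATION active_departments ADD TABLE departments WHERE (active);

-- but not for DROP TABLE; 0003 rejects this with an error:
ALTER PUBLICATION active_departments DROP TABLE departments WHERE (active);

-- Note also that a condition evaluating to NULL for a row (e.g. active
-- being NULL here) counts as no match, so such rows are not published.
```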
Comments?
--
Euler Taveira Timbira -
http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Attachments:
0001-Refactor-function-create_estate_for_relation.patch (text/x-patch)
From ae95eb51dd72c2e8ba278da950b478f6c6741fc0 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Tue, 27 Feb 2018 02:21:03 +0000
Subject: [PATCH 1/3] Refactor function create_estate_for_relation
Relation localrel is the only LogicalRepRelMapEntry structure member
that is useful for create_estate_for_relation.
---
src/backend/replication/logical/worker.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 04985c9..6820c1a 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -183,7 +183,7 @@ ensure_transaction(void)
* This is based on similar code in copy.c
*/
static EState *
-create_estate_for_relation(LogicalRepRelMapEntry *rel)
+create_estate_for_relation(Relation rel)
{
EState *estate;
ResultRelInfo *resultRelInfo;
@@ -193,12 +193,12 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel)
rte = makeNode(RangeTblEntry);
rte->rtekind = RTE_RELATION;
- rte->relid = RelationGetRelid(rel->localrel);
- rte->relkind = rel->localrel->rd_rel->relkind;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
estate->es_range_table = list_make1(rte);
resultRelInfo = makeNode(ResultRelInfo);
- InitResultRelInfo(resultRelInfo, rel->localrel, 1, NULL, 0);
+ InitResultRelInfo(resultRelInfo, rel, 1, NULL, 0);
estate->es_result_relations = resultRelInfo;
estate->es_num_result_relations = 1;
@@ -584,7 +584,7 @@ apply_handle_insert(StringInfo s)
}
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel));
@@ -688,7 +688,7 @@ apply_handle_update(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel));
localslot = ExecInitExtraTupleSlot(estate,
@@ -806,7 +806,7 @@ apply_handle_delete(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel));
localslot = ExecInitExtraTupleSlot(estate,
--
2.7.4
0002-Rename-a-WHERE-node.patch (text/x-patch)
From 8421809e38d3e1d43b00feaac4b8bccaa6738079 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Wed, 24 Jan 2018 17:01:31 -0200
Subject: [PATCH 2/3] Rename a WHERE node
A WHERE clause will be used for row filtering in logical replication. We
already have a similar node: 'WHERE (condition here)'. Let's rename the
node to a generic name and use it for row filtering too.
---
src/backend/parser/gram.y | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index d99f2be..bf32362 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -468,7 +468,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <node> def_arg columnElem where_clause where_or_current_clause
a_expr b_expr c_expr AexprConst indirection_el opt_slice_bound
columnref in_expr having_clause func_table xmltable array_expr
- ExclusionWhereClause operator_def_arg
+ OptWhereClause operator_def_arg
%type <list> rowsfrom_item rowsfrom_list opt_col_def_list
%type <boolean> opt_ordinality
%type <list> ExclusionConstraintList ExclusionConstraintElem
@@ -3734,7 +3734,7 @@ ConstraintElem:
$$ = (Node *)n;
}
| EXCLUDE access_method_clause '(' ExclusionConstraintList ')'
- opt_definition OptConsTableSpace ExclusionWhereClause
+ opt_definition OptConsTableSpace OptWhereClause
ConstraintAttributeSpec
{
Constraint *n = makeNode(Constraint);
@@ -3831,7 +3831,7 @@ ExclusionConstraintElem: index_elem WITH any_operator
}
;
-ExclusionWhereClause:
+OptWhereClause:
WHERE '(' a_expr ')' { $$ = $3; }
| /*EMPTY*/ { $$ = NULL; }
;
--
2.7.4
0003-Row-filtering-for-logical-replication.patch (text/x-patch)
From 6567e49f95823532bc2ceccf87ed570ca4ce398d Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Tue, 27 Feb 2018 04:03:13 +0000
Subject: [PATCH 3/3] Row filtering for logical replication
When you define or modify a publication, you can optionally filter the
rows to be published using a WHERE condition. The condition can be any
expression that evaluates to a boolean. Only rows that satisfy the
WHERE condition will be sent to subscribers.
---
doc/src/sgml/ref/alter_publication.sgml | 9 ++-
doc/src/sgml/ref/create_publication.sgml | 14 ++++-
src/backend/catalog/pg_publication.c | 45 +++++++++++--
src/backend/commands/publicationcmds.c | 69 ++++++++++++++------
src/backend/parser/gram.y | 26 ++++++--
src/backend/parser/parse_agg.c | 10 +++
src/backend/parser/parse_expr.c | 5 ++
src/backend/parser/parse_func.c | 3 +
src/backend/replication/logical/worker.c | 2 +-
src/backend/replication/pgoutput/pgoutput.c | 98 ++++++++++++++++++++++++++++-
src/include/catalog/pg_publication.h | 8 ++-
src/include/catalog/pg_publication_rel.h | 11 +++-
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +++-
src/include/parser/parse_node.h | 3 +-
src/include/replication/logicalrelation.h | 2 +
src/test/subscription/t/001_rep_changes.pl | 29 ++++++++-
17 files changed, 299 insertions(+), 47 deletions(-)
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index 534e598..dc579b2 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE <replaceable class="parameter">condition</replaceable> ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE <replaceable class="parameter">condition</replaceable> ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_USER | SESSION_USER }
@@ -91,7 +91,10 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">condition</replaceable> will
+ not be published.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index bfe12d5..e42f3d4 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE <replaceable class="parameter">condition</replaceable> ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
@@ -68,7 +68,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
that table is added to the publication. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are added.
Optionally, <literal>*</literal> can be specified after the table name to
- explicitly indicate that descendant tables are included.
+ explicitly indicate that descendant tables are included. If the optional
+ <literal>WHERE</literal> clause is specified, rows that do not satisfy
+ the <replaceable class="parameter">condition</replaceable> will not be
+ published.
</para>
<para>
@@ -184,6 +187,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index ba18258..43d754d 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -34,6 +34,10 @@
#include "catalog/pg_publication.h"
#include "catalog/pg_publication_rel.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
+
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -142,18 +146,21 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
+ Oid relid = RelationGetRelid(targetrel->relation);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ RangeTblEntry *rte;
+ Node *whereclause;
rel = heap_open(PublicationRelRelationId, RowExclusiveLock);
@@ -173,10 +180,26 @@ publication_add_relation(Oid pubid, Relation targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel), pub->name)));
+ RelationGetRelationName(targetrel->relation), pub->name)));
}
- check_publication_add_relation(targetrel);
+ check_publication_add_relation(targetrel->relation);
+
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(targetrel->whereClause);
+
+ rte = addRangeTableEntryForRelation(pstate, targetrel->relation,
+ NULL, false, false);
+ addRTEtoQuery(pstate, rte, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(targetrel->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -187,6 +210,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add row filter, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prrowfilter - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prrowfilter - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -203,11 +232,17 @@ publication_add_relation(Oid pubid, Relation targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the row filter expression */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
heap_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel);
+ CacheInvalidateRelcache(targetrel->relation);
return myself;
}
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 9c5aa9e..96347bb 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -324,6 +324,27 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ /*
+ * ALTER PUBLICATION ... DROP TABLE cannot contain a WHERE clause. Use
+ * publication_table_list node (that accepts a WHERE clause) but forbid the
+ * WHERE clause in it. The use of relation_expr_list node just for the
+ * DROP TABLE part is not worth the trouble.
+ */
+ if (stmt->tableAction == DEFELEM_DROP)
+ {
+ ListCell *lc;
+
+ foreach(lc, stmt->tables)
+ {
+ PublicationTable *t = lfirst(lc);
+ if (t->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot use a WHERE clause when removing a table from publication \"%s\"",
+ NameStr(pubform->pubname))));
+ }
+ }
+
rels = OpenTableList(stmt->tables);
if (stmt->tableAction == DEFELEM_ADD)
@@ -345,9 +366,9 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
foreach(newlc, rels)
{
- Relation newrel = (Relation) lfirst(newlc);
+ PublicationRelationQual *newrel = (PublicationRelationQual *) lfirst(newlc);
- if (RelationGetRelid(newrel) == oldrelid)
+ if (RelationGetRelid(newrel->relation) == oldrelid)
{
found = true;
break;
@@ -356,7 +377,8 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
if (!found)
{
- Relation oldrel = heap_open(oldrelid,
+ PublicationRelationQual *oldrel = palloc(sizeof(PublicationRelationQual));
+ oldrel->relation = heap_open(oldrelid,
ShareUpdateExclusiveLock);
delrels = lappend(delrels, oldrel);
@@ -479,16 +501,18 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationQual *relqual;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = lfirst(lc);
- Relation rel;
- bool recurse = rv->inh;
- Oid myrelid;
+ PublicationTable *t = lfirst(lc);
+ RangeVar *rv = t->relation;
+ Relation rel;
+ bool recurse = rv->inh;
+ Oid myrelid;
CHECK_FOR_INTERRUPTS();
@@ -507,7 +531,10 @@ OpenTableList(List *tables)
heap_close(rel, ShareUpdateExclusiveLock);
continue;
}
- rels = lappend(rels, rel);
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = rel;
+ relqual->whereClause = t->whereClause;
+ rels = lappend(rels, relqual);
relids = lappend_oid(relids, myrelid);
if (recurse)
@@ -537,7 +564,11 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = heap_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = rel;
+ /* child inherits WHERE clause from parent */
+ relqual->whereClause = t->whereClause;
+ rels = lappend(rels, relqual);
relids = lappend_oid(relids, childrelid);
}
}
@@ -558,10 +589,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
- heap_close(rel, NoLock);
+ heap_close(rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -577,13 +610,13 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(RelationGetRelid(rel->relation), GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->relation->rd_rel->relkind),
+ RelationGetRelationName(rel->relation));
obj = publication_add_relation(pubid, rel, if_not_exists);
if (stmt)
@@ -609,8 +642,8 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
+ Oid relid = RelationGetRelid(rel->relation);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid),
ObjectIdGetDatum(pubid));
@@ -622,7 +655,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(rel->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index bf32362..94cdd7d 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -396,13 +396,13 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
relation_expr_list dostmt_opt_list
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
- publication_name_list
+ publication_name_list publication_table_list
vacuum_relation_list opt_vacuum_relation_list
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <value> publication_name_item
%type <list> opt_fdw_options fdw_options
@@ -9520,7 +9520,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9551,7 +9551,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9559,7 +9559,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9567,7 +9567,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE relation_expr_list
+ | ALTER PUBLICATION name DROP TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9577,6 +9577,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 377a7ed..7e1c3d8 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -522,6 +522,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in CALL arguments");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE conditions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE conditions");
+
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -902,6 +909,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CALL_ARGUMENT:
err = _("window functions are not allowed in CALL arguments");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE conditions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 385e54a..7bd1695 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -1849,6 +1849,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_CALL_ARGUMENT:
err = _("cannot use subquery in CALL argument");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE condition");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3475,6 +3478,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "PARTITION BY";
case EXPR_KIND_CALL_ARGUMENT:
return "CALL";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication WHERE";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 2a4ac09..8e9cc58 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2293,6 +2293,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CALL_ARGUMENT:
err = _("set-returning functions are not allowed in CALL arguments");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE conditions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 6820c1a..fe0a6ca 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -182,7 +182,7 @@ ensure_transaction(void)
*
* This is based on similar code in copy.c
*/
-static EState *
+EState *
create_estate_for_relation(Relation rel)
{
EState *estate;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index d538f25..30bdefa 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,13 +12,23 @@
*/
#include "postgres.h"
+#include "catalog/pg_type.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
+
+#include "executor/executor.h"
+#include "nodes/execnodes.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/planner.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/inval.h"
#include "utils/int8.h"
#include "utils/lsyscache.h"
@@ -56,6 +66,7 @@ typedef struct RelationSyncEntry
bool schema_sent; /* did we send the schema? */
bool replicate_valid;
PublicationActions pubactions;
+ List *row_filter;
} RelationSyncEntry;
/* Map used to remember which relation schemas we sent. */
@@ -286,6 +297,62 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Assert(false);
}
+ /* ... then check row filter */
+ if (list_length(relentry->row_filter) > 0)
+ {
+ HeapTuple old_tuple;
+ HeapTuple new_tuple;
+ TupleDesc tupdesc;
+ EState *estate;
+ ExprContext *ecxt;
+ MemoryContext oldcxt;
+ ListCell *lc;
+
+ old_tuple = change->data.tp.oldtuple ? &change->data.tp.oldtuple->tuple : NULL;
+ new_tuple = change->data.tp.newtuple ? &change->data.tp.newtuple->tuple : NULL;
+ tupdesc = RelationGetDescr(relation);
+ estate = create_estate_for_relation(relation);
+
+ /* prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc);
+ MemoryContextSwitchTo(oldcxt);
+
+ ExecStoreTuple(new_tuple ? new_tuple : old_tuple, ecxt->ecxt_scantuple, InvalidBuffer, false);
+
+ foreach (lc, relentry->row_filter)
+ {
+ Node *row_filter;
+ ExprState *expr_state;
+ Expr *expr;
+ Oid expr_type;
+ Datum res;
+ bool isnull;
+ char *s = NULL;
+
+ row_filter = (Node *) lfirst(lc);
+
+ /* evaluate the row filter */
+ expr_type = exprType(row_filter);
+ expr = (Expr *) coerce_to_target_type(NULL, row_filter, expr_type, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+ expr = expression_planner(expr);
+ expr_state = ExecInitExpr(expr, NULL);
+ res = ExecEvalExpr(expr_state, ecxt, &isnull);
+
+ /* if tuple does not match row filter, bail out */
+ if (!DatumGetBool(res) || isnull)
+ return;
+
+ s = nodeToString(row_filter);
+ elog(DEBUG2, "filter \"%s\" was matched", s);
+ pfree(s);
+ }
+
+ ExecDropSingleTupleTableSlot(ecxt->ecxt_scantuple);
+ FreeExecutorState(estate);
+ }
+
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
@@ -506,10 +573,14 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
*/
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = false;
+ entry->row_filter = NIL;
foreach(lc, data->publications)
{
Publication *pub = lfirst(lc);
+ HeapTuple rf_tuple;
+ Datum rf_datum;
+ bool rf_isnull;
/*
* Skip tables that look like they are from a heap rewrite (see
@@ -543,9 +614,25 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubdelete |= pub->pubactions.pubdelete;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete)
- break;
+ /* Cache row filters, if available */
+ rf_tuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rf_tuple))
+ {
+ rf_datum = SysCacheGetAttr(PUBLICATIONRELMAP, rf_tuple, Anum_pg_publication_rel_prrowfilter, &rf_isnull);
+
+ if (!rf_isnull)
+ {
+ MemoryContext oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ char *s = TextDatumGetCString(rf_datum);
+ Node *rf_node = stringToNode(s);
+ entry->row_filter = lappend(entry->row_filter, rf_node);
+ MemoryContextSwitchTo(oldctx);
+
+ elog(DEBUG2, "row filter \"%s\" found for publication \"%s\" and relation \"%s\"", s, pub->name, get_rel_name(relid));
+ }
+
+ ReleaseSysCache(rf_tuple);
+ }
}
list_free(pubids);
@@ -620,5 +707,10 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
*/
hash_seq_init(&status, RelationSyncCache);
while ((entry = (RelationSyncEntry *) hash_seq_search(&status)) != NULL)
+ {
entry->replicate_valid = false;
+ if (list_length(entry->row_filter) > 0)
+ list_free(entry->row_filter);
+ entry->row_filter = NIL;
+ }
}
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 37e77b8..28962e6 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -86,6 +86,12 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationQual
+{
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationQual;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -94,7 +100,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(void);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index 033b600..585f855 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -29,8 +29,12 @@
CATALOG(pg_publication_rel,6106)
{
- Oid prpubid; /* Oid of the publication */
- Oid prrelid; /* Oid of the relation */
+ Oid prpubid; /* Oid of the publication */
+ Oid prrelid; /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prrowfilter; /* nodeToString representation of row filter */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -45,8 +49,9 @@ typedef FormData_pg_publication_rel *Form_pg_publication_rel;
* ----------------
*/
-#define Natts_pg_publication_rel 2
+#define Natts_pg_publication_rel 3
#define Anum_pg_publication_rel_prpubid 1
#define Anum_pg_publication_rel_prrelid 2
+#define Anum_pg_publication_rel_prrowfilter 3
#endif /* PG_PUBLICATION_REL_H */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 74b094a..499d839 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -471,6 +471,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index ac292bc..9800acf 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3419,12 +3419,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3437,7 +3444,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 0230543..8e3c735 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -69,7 +69,8 @@ typedef enum ParseExprKind
EXPR_KIND_TRIGGER_WHEN, /* WHEN condition in CREATE TRIGGER */
EXPR_KIND_POLICY, /* USING or WITH CHECK expr in policy */
EXPR_KIND_PARTITION_EXPRESSION, /* PARTITION BY expression */
- EXPR_KIND_CALL_ARGUMENT /* procedure argument in CALL */
+ EXPR_KIND_CALL_ARGUMENT, /* procedure argument in CALL */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index d4250c2..32f1312 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -39,4 +39,6 @@ extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
extern void logicalrep_typmap_update(LogicalRepTyp *remotetyp);
extern Oid logicalrep_typmap_getid(Oid remoteid);
+extern EState *create_estate_for_relation(Relation rel);
+
#endif /* LOGICALRELATION_H */
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index e0104cd..d40ae03 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 16;
+use Test::More tests => 17;
# Initialize publisher node
my $node_publisher = get_new_node('publisher');
@@ -31,6 +31,8 @@ $node_publisher->safe_psql('postgres',
"CREATE TABLE tab_mixed (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_mixed (a, b) VALUES (1, 'foo')");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter (a int primary key, b text)");
# Setup structure on subscriber
$node_subscriber->safe_psql('postgres', "CREATE TABLE tab_notrep (a int)");
@@ -39,6 +41,8 @@ $node_subscriber->safe_psql('postgres', "CREATE TABLE tab_full (a int)");
$node_subscriber->safe_psql('postgres', "CREATE TABLE tab_full2 (x text)");
$node_subscriber->safe_psql('postgres',
"CREATE TABLE tab_rep (a int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter (a int primary key, b text)");
# different column count and order than on publisher
$node_subscriber->safe_psql('postgres',
@@ -54,10 +58,12 @@ $node_publisher->safe_psql('postgres',
);
$node_publisher->safe_psql('postgres',
"ALTER PUBLICATION tap_pub_ins_only ADD TABLE tab_ins");
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_row_filter FOR TABLE tab_rowfilter WHERE (a > 1000 AND b <> 'filtered')");
my $appname = 'tap_sub';
$node_subscriber->safe_psql('postgres',
-"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub, tap_pub_ins_only"
+"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub, tap_pub_ins_only, tap_pub_row_filter"
);
$node_publisher->wait_for_catchup($appname);
@@ -76,6 +82,25 @@ $result =
$node_subscriber->safe_psql('postgres', "SELECT count(*) FROM tab_ins");
is($result, qq(1002), 'check initial data was copied to subscriber');
+# row filter
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter (a, b) SELECT x, 'test ' || x FROM generate_series(990,1003) x");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT a, b FROM tab_rowfilter");
+is($result, qq(1980|not filtered
+1001|test 1001
+1002|test 1002
+1003|test 1003), 'check initial data was copied to subscriber');
+
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_ins SELECT generate_series(1,50)");
$node_publisher->safe_psql('postgres', "DELETE FROM tab_ins WHERE a > 20");
--
2.7.4
On Wed, Feb 28, 2018 at 08:03:02PM -0300, Euler Taveira wrote:
Hi,
The attached patches add support for filtering rows in the publisher.
The output plugin will do the work if a filter was defined in CREATE
PUBLICATION command. An optional WHERE clause can be added after the
table name in the CREATE PUBLICATION such as:
CREATE PUBLICATION foo FOR TABLE departments WHERE (id > 2000 AND id <= 3000);
Rows that don't match the WHERE clause will not be sent to the subscribers.
Patches 0001 and 0002 are only refactors and can be applied
independently. 0003 doesn't include row filtering on initial
synchronization.
Comments?
Great feature! I think a lot of people will like to have the option
of trading a little extra CPU on the pub side for a bunch of network
traffic and some work on the sub side.
I noticed that the WHERE clause applies to all tables in the
publication. Is that actually the right thing? I'm thinking of a
case where we have foo(id, ...) and bar(foo_id, ....). To slice that
correctly, we'd want to do the ids in the foo table and the foo_ids in
the bar table. In the system as written, that would entail, at least
potentially, writing a lot of publications by hand.
Something like
WHERE (
(table_1,..., table_N) HAS (/* WHERE clause here */) AND
(table_N+1,..., table_M) HAS (/* WHERE clause here */) AND
...
)
could be one way to specify.
I also noticed that in psql, \dRp+ doesn't show the WHERE clause,
which it probably should.
Does it need regression tests?
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On 1 March 2018 at 07:03, Euler Taveira <euler@timbira.com.br> wrote:
Hi,
The attached patches add support for filtering rows in the publisher.
The output plugin will do the work if a filter was defined in CREATE
PUBLICATION command. An optional WHERE clause can be added after the
table name in the CREATE PUBLICATION such as:
CREATE PUBLICATION foo FOR TABLE departments WHERE (id > 2000 AND id <= 3000);
Rows that don't match the WHERE clause will not be sent to the subscribers.
Patches 0001 and 0002 are only refactors and can be applied
independently. 0003 doesn't include row filtering on initial
synchronization.
Good idea. I haven't read this yet, but one thing to make sure you've
handled is limiting the clause to referencing only the current tuple and
the catalogs. User-catalog tables are OK too, i.e. anything that is
RelationIsAccessibleInLogicalDecoding().
This means only immutable functions may be invoked, since a stable or
volatile function might attempt to access a table. And views must be
prohibited or recursively checked. (We have tree walkers that would help
with this).
It might be worth looking at the current logic for CHECK expressions, since
the requirements are similar. In my opinion you could safely not bother
with allowing access to user catalog tables in the filter expressions and
limit them strictly to immutable functions and the tuple itself.
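The "recursively checked with a tree walker" idea Craig raises can be pictured with a small sketch. This is plain Python, not the PostgreSQL C walker API; the node classes, function names, and volatility labels are invented for illustration only:

```python
from dataclasses import dataclass, field
from typing import List, Union

# Toy expression-tree nodes standing in for PostgreSQL parse nodes.
@dataclass
class Const:
    value: object

@dataclass
class ColumnRef:
    name: str

@dataclass
class FuncCall:
    name: str
    volatility: str                 # 'immutable', 'stable' or 'volatile'
    args: List["Node"] = field(default_factory=list)

@dataclass
class BoolExpr:
    op: str                         # 'and' / 'or' / 'not'
    args: List["Node"] = field(default_factory=list)

Node = Union[Const, ColumnRef, FuncCall, BoolExpr]

def only_immutable(node: "Node") -> bool:
    """Recursively walk the tree; reject any stable or volatile function."""
    if isinstance(node, (Const, ColumnRef)):
        return True
    if isinstance(node, FuncCall):
        return (node.volatility == "immutable"
                and all(only_immutable(a) for a in node.args))
    if isinstance(node, BoolExpr):
        return all(only_immutable(a) for a in node.args)
    return False                    # unknown node kinds are rejected

# WHERE (id > 2000 AND id <= 3000): integer comparisons are immutable
ok = BoolExpr("and", [
    FuncCall("int4gt", "immutable", [ColumnRef("id"), Const(2000)]),
    FuncCall("int4le", "immutable", [ColumnRef("id"), Const(3000)]),
])

# WHERE (created < now()): now() is stable, so this tree must be rejected
bad = FuncCall("timestamptz_lt", "immutable",
               [ColumnRef("created"), FuncCall("now", "stable")])
```

The real check would hang off the existing expression walkers (as for CHECK constraints), but the shape of the recursion is the same: fail fast on the first non-immutable function anywhere in the tree.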
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2018-03-01 00:03, Euler Taveira wrote:
The attached patches add support for filtering rows in the publisher.
0001-Refactor-function-create_estate_for_relation.patch
0002-Rename-a-WHERE-node.patch
0003-Row-filtering-for-logical-replication.patch
Comments?
Very, very useful. I really do hope this patch survives the
late-arrival-cull.
I built this functionality into a test program I have been using and in
simple cascading replication tests it works well.
I did find what I think is a bug (a bug easy to avoid but also easy to
run into):
The test I used was to cascade 3 instances (all on one machine) from
A->B->C
I ran a pgbench session in instance A, and used:
in A: alter publication pub0_6515 add table pgbench_accounts where
(aid between 40000 and 60000-1);
in B: alter publication pub1_6516 add table pgbench_accounts;
The above worked well, but when I did the same but used the filter in
both publications:
in A: alter publication pub0_6515 add table pgbench_accounts where
(aid between 40000 and 60000-1);
in B: alter publication pub1_6516 add table pgbench_accounts where
(aid between 40000 and 60000-1);
then the replication only worked for (pgbench-)scale 1 (hence: very
little data); with larger scales it became slow (taking many minutes
where the above had taken less than 1 minute), and ended up using far
too much memory (or blowing up/crashing altogether). Something not
quite right there.
Nevertheless, I am much in favour of acquiring this functionality as
soon as possible.
Thanks,
Erik Rijkers
2018-02-28 21:47 GMT-03:00 David Fetter <david@fetter.org>:
I noticed that the WHERE clause applies to all tables in the
publication. Is that actually the right thing? I'm thinking of a
case where we have foo(id, ...) and bar(foo_id, ....). To slice that
correctly, we'd want to do the ids in the foo table and the foo_ids in
the bar table. In the system as written, that would entail, at least
potentially, writing a lot of publications by hand.
I didn't make it clear in my previous email and I think you misread
the attached docs. Each table can have an optional WHERE clause. I'll
make it clear when I rewrite the tests. Something like:
CREATE PUBLICATION tap_pub FOR TABLE tab_rowfilter_1 WHERE (a > 1000
AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0),
tab_rowfilter_3;
Such syntax will not block another future feature that will publish
only a few columns of the table.
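The per-table semantics Euler describes can be sketched in a few lines. This is a hypothetical Python model of the publisher-side behaviour, not the pgoutput implementation; only the table names and predicates mirror the example above:

```python
# One optional predicate per published table; None means "no WHERE clause".
publication = {
    "tab_rowfilter_1": lambda row: row["a"] > 1000 and row["b"] != "filtered",
    "tab_rowfilter_2": lambda row: row["c"] % 2 == 0,
    "tab_rowfilter_3": None,     # no filter: every row is published
}

def rows_to_send(table, rows):
    """Apply the table's own row filter before handing rows downstream."""
    pred = publication.get(table)
    if pred is None:
        return list(rows)
    return [r for r in rows if pred(r)]

changes = [
    {"a": 500,  "b": "x"},              # fails a > 1000
    {"a": 1500, "b": "filtered"},       # fails b <> 'filtered'
    {"a": 1980, "b": "not filtered"},   # passes both conditions
]
sent = rows_to_send("tab_rowfilter_1", changes)
```

The point is that the predicate is attached to the (publication, table) pair, so slicing foo on id and bar on foo_id needs only one publication with two WHERE clauses, not two publications.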
I also noticed that in psql, \dRp+ doesn't show the WHERE clause,
which it probably should.
Yeah, it could be added but I'm afraid of such long WHERE clauses.
Does it need regression tests?
I included some tests just to demonstrate the feature but I'm planning
to add a separate test file for it.
--
Euler Taveira Timbira -
http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
On Thu, Mar 01, 2018 at 12:41:04PM -0300, Euler Taveira wrote:
2018-02-28 21:47 GMT-03:00 David Fetter <david@fetter.org>:
I noticed that the WHERE clause applies to all tables in the
publication. Is that actually the right thing? I'm thinking of a
case where we have foo(id, ...) and bar(foo_id, ....). To slice that
correctly, we'd want to do the ids in the foo table and the foo_ids in
the bar table. In the system as written, that would entail, at least
potentially, writing a lot of publications by hand.
I didn't make it clear in my previous email and I think you misread
the attached docs. Each table can have an optional WHERE clause. I'll
make it clear when I rewrite the tests. Something like:
Sorry I misunderstood.
CREATE PUBLICATION tap_pub FOR TABLE tab_rowfilter_1 WHERE (a > 1000
AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0),
tab_rowfilter_3;
That's great!
Such syntax will not block another future feature that will publish
only a few columns of the table.
I also noticed that in psql, \dRp+ doesn't show the WHERE clause,
which it probably should.
Yeah, it could be added but I'm afraid of such long WHERE clauses.
I think of + as signifying, "I am ready to get a LOT of output in
order to see more detail." Perhaps that's just me.
Does it need regression tests?
I included some tests just to demonstrate the feature but I'm
planning to add a separate test file for it.
Excellent. This feature looks like a nice big chunk of the user-space
infrastructure needed for sharding, among other things.
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On 2018-03-01 16:27, Erik Rijkers wrote:
On 2018-03-01 00:03, Euler Taveira wrote:
The attached patches add support for filtering rows in the publisher.
0001-Refactor-function-create_estate_for_relation.patch
0002-Rename-a-WHERE-node.patch
0003-Row-filtering-for-logical-replication.patch
Comments?
Very, very useful. I really do hope this patch survives the
late-arrival-cull.
I built this functionality into a test program I have been using and
in simple cascading replication tests it works well.
I did find what I think is a bug (a bug easy to avoid but also easy to
run into):
The test I used was to cascade 3 instances (all on one machine) from
A->B->C
I ran a pgbench session in instance A, and used:
in A: alter publication pub0_6515 add table pgbench_accounts where
(aid between 40000 and 60000-1);
in B: alter publication pub1_6516 add table pgbench_accounts;
The above worked well, but when I did the same but used the filter in
both publications:
in A: alter publication pub0_6515 add table pgbench_accounts where
(aid between 40000 and 60000-1);
in B: alter publication pub1_6516 add table pgbench_accounts where
(aid between 40000 and 60000-1);
then the replication only worked for (pgbench-)scale 1 (hence: very
little data); with larger scales it became slow (taking many minutes
where the above had taken less than 1 minute), and ended up using far
too much memory (or blowing up/crashing altogether). Something not
quite right there.
Nevertheless, I am much in favour of acquiring this functionality as
soon as possible.
Attached is 'logrep_rowfilter.sh', a demonstration of above-described
bug.
The program runs initdb for 3 instances in /tmp (using ports 6515, 6516,
and 6517) and sets up logical replication from 1->2->3.
It can be made to work by removing the where-clause on the second 'create
publication' (i.e., comment out the $where2 variable).
Thanks,
Erik Rijkers
Attachments:
Hi,
On 2018-03-01 16:27:11 +0100, Erik Rijkers wrote:
Very, very useful. I really do hope this patch survives the
late-arrival-cull.
FWIW, I don't think it'd be fair or prudent. There's definitely some
issues (see e.g. Craig's reply), and I don't see why this patch'd
deserve an exemption from the "nontrivial patches shouldn't be submitted
to the last CF" policy?
- Andres
Hi,
On 3/1/18 4:27 PM, Andres Freund wrote:
On 2018-03-01 16:27:11 +0100, Erik Rijkers wrote:
Very, very useful. I really do hope this patch survives the
late-arrival-cull.
FWIW, I don't think it'd be fair or prudent. There's definitely some
issues (see e.g. Craig's reply), and I don't see why this patch'd
deserve an exemption from the "nontrivial patches shouldn't be submitted
to the last CF" policy?
I'm unable to find this in the CF under the title or author name. If it
didn't get entered then it is definitely out.
If it does have an entry, then I agree with Andres that it should be
pushed to the next CF.
--
-David
david@pgmasters.net
2018-03-01 18:27 GMT-03:00 Andres Freund <andres@anarazel.de>:
FWIW, I don't think it'd be fair or prudent. There's definitely some
issues (see e.g. Craig's reply), and I don't see why this patch'd
deserve an exemption from the "nontrivial patches shouldn't be submitted
to the last CF" policy?
I forgot to mention but this feature is for v12. I know the rules and
that is why I didn't add it to the in progress CF.
--
Euler Taveira Timbira -
http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
2018-03-01 18:25 GMT-03:00 Erik Rijkers <er@xs4all.nl>:
Attached is 'logrep_rowfilter.sh', a demonstration of above-described bug.
Thanks for testing. I will figure out what is happening. There are
some leaks around. I'll post another version when I fix some of those
bugs.
--
Euler Taveira Timbira -
http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
2018-02-28 21:54 GMT-03:00 Craig Ringer <craig@2ndquadrant.com>:
Good idea. I haven't read this yet, but one thing to make sure you've
handled is limiting the clause to referencing only the current tuple and the
catalogs. user-catalog tables are OK, too, anything that is
RelationIsAccessibleInLogicalDecoding().
This means only immutable functions may be invoked, since a stable or
volatile function might attempt to access a table. And views must be
prohibited or recursively checked. (We have tree walkers that would help
with this).
It might be worth looking at the current logic for CHECK expressions, since
the requirements are similar. In my opinion you could safely not bother with
allowing access to user catalog tables in the filter expressions and limit
them strictly to immutable functions and the tuple itself.
IIRC the implementation is similar to RLS expressions. I'll check all of
these rules.
--
Euler Taveira Timbira -
http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
On 3/1/18 6:00 PM, Euler Taveira wrote:
2018-03-01 18:27 GMT-03:00 Andres Freund <andres@anarazel.de>:
FWIW, I don't think it'd be fair or prudent. There's definitely some
issues (see e.g. Craig's reply), and I don't see why this patch'd
deserve an exemption from the "nontrivial patches shouldn't be submitted
to the last CF" policy?
I forgot to mention but this feature is for v12. I know the rules and
that is why I didn't add it to the in progress CF.
That was the right thing to do, thank you!
--
-David
david@pgmasters.net
On Thu, Mar 01, 2018 at 06:16:17PM -0500, David Steele wrote:
That was the right thing to do, thank you!
This patch has been waiting on author for a couple of months and does
not apply anymore, so I am marking as returned with feedback. If you
can rebase, please feel free to resubmit.
--
Michael
Em qua, 28 de fev de 2018 às 20:03, Euler Taveira
<euler@timbira.com.br> escreveu:
The attached patches add support for filtering rows in the publisher.
I rebased the patch. I added row filtering for initial
synchronization, pg_dump support and psql support. 0001 removes unused
code. 0002 reduces memory use. 0003 passes only the structure member that
is used in create_estate_for_relation. 0004 reuses a parser node for
row filtering. 0005 is the feature. 0006 prints WHERE expression in
psql. 0007 adds pg_dump support. 0008 is only for debug purposes (I'm
not sure some of these messages will be part of the final patch).
0001, 0002, 0003 and 0008 are not mandatory for this feature.
Comments?
--
Euler Taveira Timbira -
http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Attachments:
0001-Remove-unused-atttypmod-column-from-initial-table-sy.patchtext/x-patch; charset=US-ASCII; name=0001-Remove-unused-atttypmod-column-from-initial-table-sy.patchDownload
From b2e56eaa9e16246c8158ff2961a6a4b2acbd096b Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Fri, 9 Mar 2018 18:39:22 +0000
Subject: [PATCH 1/8] Remove unused atttypmod column from initial table
synchronization
Since commit 7c4f52409a8c7d85ed169bbbc1f6092274d03920, atttypmod was
added but not used. The removal is safe because COPY from publisher
does not need such information.
---
src/backend/replication/logical/tablesync.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 6e420d8..f285813 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -660,7 +660,7 @@ fetch_remote_table_info(char *nspname, char *relname,
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[2] = {OIDOID, CHAROID};
- Oid attrRow[4] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
+ Oid attrRow[3] = {TEXTOID, OIDOID, BOOLOID};
bool isnull;
int natt;
@@ -704,7 +704,6 @@ fetch_remote_table_info(char *nspname, char *relname,
appendStringInfo(&cmd,
"SELECT a.attname,"
" a.atttypid,"
- " a.atttypmod,"
" a.attnum = ANY(i.indkey)"
" FROM pg_catalog.pg_attribute a"
" LEFT JOIN pg_catalog.pg_index i"
@@ -714,7 +713,7 @@ fetch_remote_table_info(char *nspname, char *relname,
" AND a.attrelid = %u"
" ORDER BY a.attnum",
lrel->remoteid, lrel->remoteid);
- res = walrcv_exec(wrconn, cmd.data, 4, attrRow);
+ res = walrcv_exec(wrconn, cmd.data, 3, attrRow);
if (res->status != WALRCV_OK_TUPLES)
ereport(ERROR,
@@ -735,7 +734,7 @@ fetch_remote_table_info(char *nspname, char *relname,
Assert(!isnull);
lrel->atttyps[natt] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
Assert(!isnull);
- if (DatumGetBool(slot_getattr(slot, 4, &isnull)))
+ if (DatumGetBool(slot_getattr(slot, 3, &isnull)))
lrel->attkeys = bms_add_member(lrel->attkeys, natt);
/* Should never happen. */
--
2.7.4
0002-Store-number-of-tuples-in-WalRcvExecResult.patchtext/x-patch; charset=US-ASCII; name=0002-Store-number-of-tuples-in-WalRcvExecResult.patchDownload
From 797a0e8d858b490df7a9e1526f76e49fe1e10215 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Fri, 9 Mar 2018 17:37:36 +0000
Subject: [PATCH 2/8] Store number of tuples in WalRcvExecResult
It seems to be useful information when allocating memory for queries
that return more than one row. It reduces memory allocation
for initial table synchronization.
While at it, since we have the number of columns, allocate only nfields
for cstrs instead of MaxTupleAttributeNumber.
---
src/backend/replication/libpqwalreceiver/libpqwalreceiver.c | 9 ++++++---
src/backend/replication/logical/tablesync.c | 5 ++---
src/include/replication/walreceiver.h | 1 +
3 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 1e1695e..2533e3a 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -865,6 +865,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
errdetail("Expected %d fields, got %d fields.",
nRetTypes, nfields)));
+ walres->ntuples = PQntuples(pgres);
walres->tuplestore = tuplestore_begin_heap(true, false, work_mem);
/* Create tuple descriptor corresponding to expected result. */
@@ -875,7 +876,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
/* No point in doing more here if there were no tuples returned. */
- if (PQntuples(pgres) == 0)
+ if (walres->ntuples == 0)
return;
/* Create temporary context for local allocations. */
@@ -884,15 +885,17 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
ALLOCSET_DEFAULT_SIZES);
/* Process returned rows. */
- for (tupn = 0; tupn < PQntuples(pgres); tupn++)
+ for (tupn = 0; tupn < walres->ntuples; tupn++)
{
- char *cstrs[MaxTupleAttributeNumber];
+ char **cstrs;
CHECK_FOR_INTERRUPTS();
/* Do the allocations in temporary context. */
oldcontext = MemoryContextSwitchTo(rowcontext);
+ cstrs = palloc(nfields * sizeof(char *));
+
/*
* Fill cstrs with null-terminated strings of column values.
*/
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f285813..e119781 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -720,9 +720,8 @@ fetch_remote_table_info(char *nspname, char *relname,
(errmsg("could not fetch table info for table \"%s.%s\": %s",
nspname, relname, res->err)));
- /* We don't know the number of rows coming, so allocate enough space. */
- lrel->attnames = palloc0(MaxTupleAttributeNumber * sizeof(char *));
- lrel->atttyps = palloc0(MaxTupleAttributeNumber * sizeof(Oid));
+ lrel->attnames = palloc0(res->ntuples * sizeof(char *));
+ lrel->atttyps = palloc0(res->ntuples * sizeof(Oid));
lrel->attkeys = NULL;
natt = 0;
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index 5913b58..62f63f9 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -197,6 +197,7 @@ typedef struct WalRcvExecResult
char *err;
Tuplestorestate *tuplestore;
TupleDesc tupledesc;
+ int ntuples;
} WalRcvExecResult;
/* libpqwalreceiver hooks */
--
2.7.4
0003-Refactor-function-create_estate_for_relation.patchtext/x-patch; charset=US-ASCII; name=0003-Refactor-function-create_estate_for_relation.patchDownload
From 6dd5414b8dcf7b94b0901c9dfbd50d68a4c33ba1 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Tue, 27 Feb 2018 02:21:03 +0000
Subject: [PATCH 3/8] Refactor function create_estate_for_relation
Relation localrel is the only LogicalRepRelMapEntry structure member
that is useful for create_estate_for_relation.
---
src/backend/replication/logical/worker.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 277da69..fa2f0ad 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -187,7 +187,7 @@ ensure_transaction(void)
* This is based on similar code in copy.c
*/
static EState *
-create_estate_for_relation(LogicalRepRelMapEntry *rel)
+create_estate_for_relation(Relation rel)
{
EState *estate;
ResultRelInfo *resultRelInfo;
@@ -197,13 +197,13 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel)
rte = makeNode(RangeTblEntry);
rte->rtekind = RTE_RELATION;
- rte->relid = RelationGetRelid(rel->localrel);
- rte->relkind = rel->localrel->rd_rel->relkind;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
rte->rellockmode = AccessShareLock;
ExecInitRangeTable(estate, list_make1(rte));
resultRelInfo = makeNode(ResultRelInfo);
- InitResultRelInfo(resultRelInfo, rel->localrel, 1, NULL, 0);
+ InitResultRelInfo(resultRelInfo, rel, 1, NULL, 0);
estate->es_result_relations = resultRelInfo;
estate->es_num_result_relations = 1;
@@ -607,7 +607,7 @@ apply_handle_insert(StringInfo s)
}
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel));
@@ -713,7 +713,7 @@ apply_handle_update(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel));
localslot = ExecInitExtraTupleSlot(estate,
@@ -831,7 +831,7 @@ apply_handle_delete(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel));
localslot = ExecInitExtraTupleSlot(estate,
--
2.7.4
0004-Rename-a-WHERE-node.patchtext/x-patch; charset=US-ASCII; name=0004-Rename-a-WHERE-node.patchDownload
From 848bc00f5e5c7b16c768eca2055a56aaef579817 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Wed, 24 Jan 2018 17:01:31 -0200
Subject: [PATCH 4/8] Rename a WHERE node
A WHERE clause will be used for row filtering in logical replication. We
already have a similar node: 'WHERE (condition here)'. Let's rename the
node to a generic name and use it for row filtering too.
---
src/backend/parser/gram.y | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 6d23bfb..756f0dd 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -470,7 +470,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <node> def_arg columnElem where_clause where_or_current_clause
a_expr b_expr c_expr AexprConst indirection_el opt_slice_bound
columnref in_expr having_clause func_table xmltable array_expr
- ExclusionWhereClause operator_def_arg
+ OptWhereClause operator_def_arg
%type <list> rowsfrom_item rowsfrom_list opt_col_def_list
%type <boolean> opt_ordinality
%type <list> ExclusionConstraintList ExclusionConstraintElem
@@ -3742,7 +3742,7 @@ ConstraintElem:
$$ = (Node *)n;
}
| EXCLUDE access_method_clause '(' ExclusionConstraintList ')'
- opt_c_include opt_definition OptConsTableSpace ExclusionWhereClause
+ opt_c_include opt_definition OptConsTableSpace OptWhereClause
ConstraintAttributeSpec
{
Constraint *n = makeNode(Constraint);
@@ -3844,7 +3844,7 @@ ExclusionConstraintElem: index_elem WITH any_operator
}
;
-ExclusionWhereClause:
+OptWhereClause:
WHERE '(' a_expr ')' { $$ = $3; }
| /*EMPTY*/ { $$ = NULL; }
;
--
2.7.4
0005-Row-filtering-for-logical-replication.patchtext/x-patch; charset=US-ASCII; name=0005-Row-filtering-for-logical-replication.patchDownload
From 80f710ffe42329d321e803cffc45314f35eda6c2 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Tue, 27 Feb 2018 04:03:13 +0000
Subject: [PATCH 5/8] Row filtering for logical replication
When you define or modify a publication you optionally can filter rows
to be published using a WHERE condition. This condition is any
expression that evaluates to boolean. Only those rows that
satisfy the WHERE condition will be sent to subscribers.
---
doc/src/sgml/ref/alter_publication.sgml | 9 ++-
doc/src/sgml/ref/create_publication.sgml | 14 +++-
src/backend/catalog/pg_publication.c | 46 +++++++++--
src/backend/commands/publicationcmds.c | 69 +++++++++++-----
src/backend/parser/gram.y | 26 ++++--
src/backend/parser/parse_agg.c | 10 +++
src/backend/parser/parse_expr.c | 4 +
src/backend/parser/parse_func.c | 2 +
src/backend/replication/logical/proto.c | 2 +-
src/backend/replication/logical/relation.c | 13 +++
src/backend/replication/logical/tablesync.c | 119 +++++++++++++++++++++++++---
src/backend/replication/logical/worker.c | 2 +-
src/backend/replication/pgoutput/pgoutput.c | 97 ++++++++++++++++++++++-
src/include/catalog/pg_publication.h | 8 +-
src/include/catalog/pg_publication_rel.h | 8 +-
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 ++-
src/include/parser/parse_node.h | 3 +-
src/include/replication/logicalproto.h | 2 +
src/include/replication/logicalrelation.h | 2 +
src/test/regress/expected/misc_sanity.out | 3 +-
src/test/subscription/t/010_row_filter.pl | 97 +++++++++++++++++++++++
22 files changed, 492 insertions(+), 56 deletions(-)
create mode 100644 src/test/subscription/t/010_row_filter.pl
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index 534e598..5984915 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_USER | SESSION_USER }
@@ -91,7 +91,10 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the expression.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 99f87ca..d5fed30 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
@@ -68,7 +68,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
that table is added to the publication. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are added.
Optionally, <literal>*</literal> can be specified after the table name to
- explicitly indicate that descendant tables are included.
+ explicitly indicate that descendant tables are included. If the optional
+ <literal>WHERE</literal> clause is specified, rows that do not satisfy
+ the <replaceable class="parameter">expression</replaceable> will not be
+ published. Note that parentheses are required around the expression.
</para>
<para>
@@ -184,6 +187,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
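[Editor's note between patch files: a hedged sketch of the documented semantics above. The `departments` table and its columns are illustrative, not part of the patch.]

```sql
-- Only rows satisfying the WHERE expression are sent to subscribers.
CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);

INSERT INTO departments VALUES (10, 'sales', true);     -- replicated
INSERT INTO departments VALUES (11, 'archived', false); -- filtered out
```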
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 3ecf6d5..f9f18a6 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -34,6 +34,10 @@
#include "catalog/pg_publication.h"
#include "catalog/pg_publication_rel.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
+
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -142,18 +146,21 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
+ Oid relid = RelationGetRelid(targetrel->relation);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ RangeTblEntry *rte;
+ Node *whereclause;
rel = heap_open(PublicationRelRelationId, RowExclusiveLock);
@@ -173,10 +180,27 @@ publication_add_relation(Oid pubid, Relation targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel), pub->name)));
+ RelationGetRelationName(targetrel->relation), pub->name)));
}
- check_publication_add_relation(targetrel);
+ check_publication_add_relation(targetrel->relation);
+
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(targetrel->whereClause);
+
+ rte = addRangeTableEntryForRelation(pstate, targetrel->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addRTEtoQuery(pstate, rte, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(targetrel->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -187,6 +211,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add row filter, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prrowfilter - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prrowfilter - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -203,11 +233,17 @@ publication_add_relation(Oid pubid, Relation targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the row filter expression */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
heap_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel);
+ CacheInvalidateRelcache(targetrel->relation);
return myself;
}
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 6f7762a..d4fca7f 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -338,6 +338,27 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ /*
+ * ALTER PUBLICATION ... DROP TABLE cannot contain a WHERE clause. Reuse
+ * the publication_table_list node (which accepts a WHERE clause) but
+ * forbid the clause here; a separate relation_expr_list node just for
+ * the DROP TABLE case is not worth the trouble.
+ */
+ if (stmt->tableAction == DEFELEM_DROP)
+ {
+ ListCell *lc;
+
+ foreach(lc, stmt->tables)
+ {
+ PublicationTable *t = lfirst(lc);
+ if (t->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot use a WHERE clause for removing table from publication \"%s\"",
+ NameStr(pubform->pubname))));
+ }
+ }
+
rels = OpenTableList(stmt->tables);
if (stmt->tableAction == DEFELEM_ADD)
@@ -359,9 +380,9 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
foreach(newlc, rels)
{
- Relation newrel = (Relation) lfirst(newlc);
+ PublicationRelationQual *newrel = (PublicationRelationQual *) lfirst(newlc);
- if (RelationGetRelid(newrel) == oldrelid)
+ if (RelationGetRelid(newrel->relation) == oldrelid)
{
found = true;
break;
@@ -370,7 +391,8 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
if (!found)
{
- Relation oldrel = heap_open(oldrelid,
+ PublicationRelationQual *oldrel = palloc(sizeof(PublicationRelationQual));
+ oldrel->relation = heap_open(oldrelid,
ShareUpdateExclusiveLock);
delrels = lappend(delrels, oldrel);
@@ -493,16 +515,18 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationQual *relqual;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = lfirst(lc);
- Relation rel;
- bool recurse = rv->inh;
- Oid myrelid;
+ PublicationTable *t = lfirst(lc);
+ RangeVar *rv = t->relation;
+ Relation rel;
+ bool recurse = rv->inh;
+ Oid myrelid;
CHECK_FOR_INTERRUPTS();
@@ -521,7 +545,10 @@ OpenTableList(List *tables)
heap_close(rel, ShareUpdateExclusiveLock);
continue;
}
- rels = lappend(rels, rel);
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = rel;
+ relqual->whereClause = t->whereClause;
+ rels = lappend(rels, relqual);
relids = lappend_oid(relids, myrelid);
if (recurse)
@@ -551,7 +578,11 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = heap_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = rel;
+ /* child inherits WHERE clause from parent */
+ relqual->whereClause = t->whereClause;
+ rels = lappend(rels, relqual);
relids = lappend_oid(relids, childrelid);
}
}
@@ -572,10 +603,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
- heap_close(rel, NoLock);
+ heap_close(rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -591,13 +624,13 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(RelationGetRelid(rel->relation), GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->relation->rd_rel->relkind),
+ RelationGetRelationName(rel->relation));
obj = publication_add_relation(pubid, rel, if_not_exists);
if (stmt)
@@ -623,8 +656,8 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
+ Oid relid = RelationGetRelid(rel->relation);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid),
ObjectIdGetDatum(pubid));
@@ -636,7 +669,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(rel->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
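[Editor's note between patch files: the ALTER PUBLICATION behavior implemented above, sketched with statements taken from the patch's TAP test.]

```sql
-- ADD and SET accept a per-table WHERE clause:
ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 2 = 0);
ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000), tab_rowfilter_3;

-- DROP TABLE must not carry a WHERE clause; this statement is rejected:
ALTER PUBLICATION tap_pub_1 DROP TABLE tab_rowfilter_2 WHERE (c % 2 = 0);
```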
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 756f0dd..edf1fef 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -398,13 +398,13 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
relation_expr_list dostmt_opt_list
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
- publication_name_list
+ publication_name_list publication_table_list
vacuum_relation_list opt_vacuum_relation_list
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <value> publication_name_item
%type <list> opt_fdw_options fdw_options
@@ -9526,7 +9526,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9557,7 +9557,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9565,7 +9565,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9573,7 +9573,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE relation_expr_list
+ | ALTER PUBLICATION name DROP TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9583,6 +9583,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 61727e1..128d0b9 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -522,6 +522,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in CALL arguments");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -902,6 +909,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CALL_ARGUMENT:
err = _("window functions are not allowed in CALL arguments");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 385e54a..55cc385 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -1848,6 +1848,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
break;
case EXPR_KIND_CALL_ARGUMENT:
err = _("cannot use subquery in CALL argument");
+ break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
break;
/*
@@ -3475,6 +3477,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "PARTITION BY";
case EXPR_KIND_CALL_ARGUMENT:
return "CALL";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 4425715..e997ea0 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2369,6 +2369,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
break;
case EXPR_KIND_CALL_ARGUMENT:
err = _("set-returning functions are not allowed in CALL arguments");
+ break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
break;
/*
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 1945171..7ce9378 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -378,7 +378,7 @@ logicalrep_write_rel(StringInfo out, Relation rel)
LogicalRepRelation *
logicalrep_read_rel(StringInfo in)
{
- LogicalRepRelation *rel = palloc(sizeof(LogicalRepRelation));
+ LogicalRepRelation *rel = palloc0(sizeof(LogicalRepRelation));
rel->remoteid = pq_getmsgint(in, 4);
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 1f20df5..8cbb394 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -140,6 +140,16 @@ logicalrep_relmap_free_entry(LogicalRepRelMapEntry *entry)
}
bms_free(remoterel->attkeys);
+ if (remoterel->nrowfilters > 0)
+ {
+ int i;
+
+ for (i = 0; i < remoterel->nrowfilters; i++)
+ pfree(remoterel->rowfiltercond[i]);
+
+ pfree(remoterel->rowfiltercond);
+ }
+
if (entry->attrmap)
pfree(entry->attrmap);
}
@@ -187,6 +197,9 @@ logicalrep_relmap_update(LogicalRepRelation *remoterel)
}
entry->remoterel.replident = remoterel->replident;
entry->remoterel.attkeys = bms_copy(remoterel->attkeys);
+ entry->remoterel.rowfiltercond = palloc(remoterel->nrowfilters * sizeof(char *));
+ for (i = 0; i < remoterel->nrowfilters; i++)
+ entry->remoterel.rowfiltercond[i] = pstrdup(remoterel->rowfiltercond[i]);
MemoryContextSwitchTo(oldctx);
}
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index e119781..fa7c111 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -661,8 +661,14 @@ fetch_remote_table_info(char *nspname, char *relname,
TupleTableSlot *slot;
Oid tableRow[2] = {OIDOID, CHAROID};
Oid attrRow[3] = {TEXTOID, OIDOID, BOOLOID};
+ Oid rowfilterRow[1] = {TEXTOID};
bool isnull;
- int natt;
+ int n;
+ ListCell *lc;
+ bool first;
+
+ /* Avoid trashing relation map cache */
+ memset(lrel, 0, sizeof(LogicalRepRelation));
lrel->nspname = nspname;
lrel->relname = relname;
@@ -724,20 +730,20 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->atttyps = palloc0(res->ntuples * sizeof(Oid));
lrel->attkeys = NULL;
- natt = 0;
+ n = 0;
slot = MakeSingleTupleTableSlot(res->tupledesc);
while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
{
- lrel->attnames[natt] =
+ lrel->attnames[n] =
TextDatumGetCString(slot_getattr(slot, 1, &isnull));
Assert(!isnull);
- lrel->atttyps[natt] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
+ lrel->atttyps[n] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
Assert(!isnull);
if (DatumGetBool(slot_getattr(slot, 3, &isnull)))
- lrel->attkeys = bms_add_member(lrel->attkeys, natt);
+ lrel->attkeys = bms_add_member(lrel->attkeys, n);
/* Should never happen. */
- if (++natt >= MaxTupleAttributeNumber)
+ if (++n >= MaxTupleAttributeNumber)
elog(ERROR, "too many columns in remote table \"%s.%s\"",
nspname, relname);
@@ -745,7 +751,54 @@ fetch_remote_table_info(char *nspname, char *relname,
}
ExecDropSingleTupleTableSlot(slot);
- lrel->natts = natt;
+ lrel->natts = n;
+
+ walrcv_clear_result(res);
+
+ /* Fetch row filtering info */
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd, "SELECT pg_get_expr(prrowfilter, prrelid) FROM pg_publication p INNER JOIN pg_publication_rel pr ON (p.oid = pr.prpubid) WHERE pr.prrelid = %u AND p.pubname IN (", MyLogicalRepWorker->relid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(wrconn, cmd.data, 1, rowfilterRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch row filter info for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ lrel->rowfiltercond = palloc0(res->ntuples * sizeof(char *));
+
+ n = 0;
+ slot = MakeSingleTupleTableSlot(res->tupledesc);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ {
+ char *p = TextDatumGetCString(rf);
+ lrel->rowfiltercond[n++] = p;
+ }
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ lrel->nrowfilters = n;
walrcv_clear_result(res);
pfree(cmd.data);
@@ -778,10 +831,57 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* list of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
initStringInfo(&cmd);
- appendStringInfo(&cmd, "COPY %s TO STDOUT",
- quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /*
+ * If the publication has any row filters for this table, build a
+ * SELECT query with the filters OR'ed together and COPY its result.
+ * If no row filters are present, COPY the whole table contents as
+ * before.
+ */
+ if (lrel.nrowfilters > 0)
+ {
+ ListCell *lc;
+ bool first;
+ int i;
+
+ appendStringInfoString(&cmd, "COPY (SELECT ");
+ /* list of attribute names */
+ first = true;
+ foreach(lc, attnamelist)
+ {
+ char *col = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+ appendStringInfo(&cmd, "%s", quote_identifier(col));
+ }
+ appendStringInfo(&cmd, " FROM %s",
+ quote_qualified_identifier(lrel.nspname, lrel.relname));
+ appendStringInfoString(&cmd, " WHERE ");
+ /* list of OR'ed filters */
+ first = true;
+ for (i = 0; i < lrel.nrowfilters; i++)
+ {
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfo(&cmd, "%s", lrel.rowfiltercond[i]);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
+ }
+ else
+ {
+ appendStringInfo(&cmd, "COPY %s TO STDOUT",
+ quote_qualified_identifier(lrel.nspname, lrel.relname));
+ }
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
@@ -796,7 +896,6 @@ copy_table(Relation rel)
addRangeTableEntryForRelation(pstate, rel, AccessShareLock,
NULL, false, false);
- attnamelist = make_copy_attnamelist(relmapentry);
cstate = BeginCopyFrom(pstate, rel, NULL, false, copy_read_data, attnamelist, NIL);
/* Do the copy */
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index fa2f0ad..f28a74f 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -186,7 +186,7 @@ ensure_transaction(void)
*
* This is based on similar code in copy.c
*/
-static EState *
+EState *
create_estate_for_relation(Relation rel)
{
EState *estate;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 86e0951..1f4a3d3 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,13 +12,23 @@
*/
#include "postgres.h"
+#include "catalog/pg_type.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
+
+#include "executor/executor.h"
+#include "nodes/execnodes.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/planner.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/inval.h"
#include "utils/int8.h"
#include "utils/memutils.h"
@@ -58,6 +68,7 @@ typedef struct RelationSyncEntry
bool schema_sent; /* did we send the schema? */
bool replicate_valid;
PublicationActions pubactions;
+ List *row_filter;
} RelationSyncEntry;
/* Map used to remember which relation schemas we sent. */
@@ -329,6 +340,63 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Assert(false);
}
+ /* ... then check row filter */
+ if (list_length(relentry->row_filter) > 0)
+ {
+ HeapTuple old_tuple;
+ HeapTuple new_tuple;
+ TupleDesc tupdesc;
+ EState *estate;
+ ExprContext *ecxt;
+ MemoryContext oldcxt;
+ ListCell *lc;
+
+ old_tuple = change->data.tp.oldtuple ? &change->data.tp.oldtuple->tuple : NULL;
+ new_tuple = change->data.tp.newtuple ? &change->data.tp.newtuple->tuple : NULL;
+ tupdesc = RelationGetDescr(relation);
+ estate = create_estate_for_relation(relation);
+
+ /* prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc);
+
+ ExecStoreHeapTuple(new_tuple ? new_tuple : old_tuple, ecxt->ecxt_scantuple, false);
+
+ foreach (lc, relentry->row_filter)
+ {
+ Node *row_filter;
+ ExprState *expr_state;
+ Expr *expr;
+ Oid expr_type;
+ Datum res;
+ bool isnull;
+
+ row_filter = (Node *) lfirst(lc);
+
+ /* evaluate the row filter expression */
+ expr_type = exprType(row_filter);
+ expr = (Expr *) coerce_to_target_type(NULL, row_filter, expr_type, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+ expr = expression_planner(expr);
+ expr_state = ExecInitExpr(expr, NULL);
+ res = ExecEvalExpr(expr_state, ecxt, &isnull);
+
+ /* if tuple does not match row filter, bail out */
+ if (!DatumGetBool(res) || isnull)
+ {
+ MemoryContextSwitchTo(oldcxt);
+ ExecDropSingleTupleTableSlot(ecxt->ecxt_scantuple);
+ FreeExecutorState(estate);
+ return;
+ }
+ }
+
+ MemoryContextSwitchTo(oldcxt);
+
+ ExecDropSingleTupleTableSlot(ecxt->ecxt_scantuple);
+ FreeExecutorState(estate);
+ }
+
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
@@ -564,10 +632,14 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
*/
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->row_filter = NIL;
foreach(lc, data->publications)
{
Publication *pub = lfirst(lc);
+ HeapTuple rf_tuple;
+ Datum rf_datum;
+ bool rf_isnull;
if (pub->alltables || list_member_oid(pubids, pub->oid))
{
@@ -577,9 +649,23 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /* Cache row filters, if available */
+ rf_tuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rf_tuple))
+ {
+ rf_datum = SysCacheGetAttr(PUBLICATIONRELMAP, rf_tuple, Anum_pg_publication_rel_prrowfilter, &rf_isnull);
+
+ if (!rf_isnull)
+ {
+ MemoryContext oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ char *s = TextDatumGetCString(rf_datum);
+ Node *rf_node = stringToNode(s);
+ entry->row_filter = lappend(entry->row_filter, rf_node);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rf_tuple);
+ }
}
list_free(pubids);
@@ -654,5 +740,10 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
*/
hash_seq_init(&status, RelationSyncCache);
while ((entry = (RelationSyncEntry *) hash_seq_search(&status)) != NULL)
+ {
entry->replicate_valid = false;
+ if (list_length(entry->row_filter) > 0)
+ list_free(entry->row_filter);
+ entry->row_filter = NIL;
+ }
}
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index a5d5570..e78222e 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -76,6 +76,12 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationQual
+{
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationQual;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -84,7 +90,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(void);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index d97b0fe..f499253 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -28,8 +28,12 @@
*/
CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
{
- Oid prpubid; /* Oid of the publication */
- Oid prrelid; /* Oid of the relation */
+ Oid prpubid; /* Oid of the publication */
+ Oid prrelid; /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prrowfilter; /* nodeToString representation of row filter */
+#endif
} FormData_pg_publication_rel;
/* ----------------
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index cac6ff0..26b79d7 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -475,6 +475,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index aa4a0db..8ac4d81 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3448,12 +3448,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3466,7 +3473,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 0230543..8e3c735 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -69,7 +69,8 @@ typedef enum ParseExprKind
EXPR_KIND_TRIGGER_WHEN, /* WHEN condition in CREATE TRIGGER */
EXPR_KIND_POLICY, /* USING or WITH CHECK expr in policy */
EXPR_KIND_PARTITION_EXPRESSION, /* PARTITION BY expression */
- EXPR_KIND_CALL_ARGUMENT /* procedure argument in CALL */
+ EXPR_KIND_CALL_ARGUMENT, /* procedure argument in CALL */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 8192f79..75be2f0 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -50,6 +50,8 @@ typedef struct LogicalRepRelation
Oid *atttyps; /* column types */
char replident; /* replica identity */
Bitmapset *attkeys; /* Bitmap of key columns */
+ char **rowfiltercond; /* condition for row filtering */
+ int nrowfilters; /* number of row filters */
} LogicalRepRelation;
/* Type mapping info */
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 73e4805..dd54295 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -39,4 +39,6 @@ extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
extern void logicalrep_typmap_update(LogicalRepTyp *remotetyp);
extern char *logicalrep_typmap_gettypname(Oid remoteid);
+extern EState *create_estate_for_relation(Relation rel);
+
#endif /* LOGICALRELATION_H */
diff --git a/src/test/regress/expected/misc_sanity.out b/src/test/regress/expected/misc_sanity.out
index 2d3522b..62eabbd 100644
--- a/src/test/regress/expected/misc_sanity.out
+++ b/src/test/regress/expected/misc_sanity.out
@@ -105,5 +105,6 @@ ORDER BY 1, 2;
pg_index | indpred | pg_node_tree
pg_largeobject | data | bytea
pg_largeobject_metadata | lomacl | aclitem[]
-(11 rows)
+ pg_publication_rel | prrowfilter | pg_node_tree
+(12 rows)
diff --git a/src/test/subscription/t/010_row_filter.pl b/src/test/subscription/t/010_row_filter.pl
new file mode 100644
index 0000000..6c174fa
--- /dev/null
+++ b/src/test/subscription/t/010_row_filter.pl
@@ -0,0 +1,97 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 4;
+
+# create publisher node
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')");
+
+my $result = $node_publisher->psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 DROP TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')");
+is($result, 3, "syntax error for ALTER PUBLICATION DROP TABLE");
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 2 = 0)");
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3");
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)");
+
+# test row filtering
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1003) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 10)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table sync to finish
+my $synced_query =
+"SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+#$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT a, b FROM tab_rowfilter_1");
+is($result, qq(1980|not filtered
+1001|test 1001
+1002|test 1002
+1003|test 1003), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(7|2|10), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check filtered data was copied to subscriber');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
2.7.4
Attachment: 0006-Print-publication-WHERE-condition-in-psql.patch (text/x-patch; charset=US-ASCII)
From 7bfdda04dbfdb7acba84296bdf15fe49bafe2d7c Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Thu, 17 May 2018 20:52:28 +0000
Subject: [PATCH 6/8] Print publication WHERE condition in psql
---
src/bin/psql/describe.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 4ca0db1..7aab8b9 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5445,7 +5445,8 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
+ "SELECT n.nspname, c.relname,\n"
+ " pg_get_expr(pr.prrowfilter, c.oid)\n"
"FROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
@@ -5475,6 +5476,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE %s",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
--
2.7.4
Attachment: 0007-Publication-where-condition-support-for-pg_dump.patch (text/x-patch; charset=US-ASCII)
From be36719198f6ad3f90164510601b35118870c389 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Sat, 15 Sep 2018 02:52:00 +0000
Subject: [PATCH 7/8] Publication where condition support for pg_dump
---
src/bin/pg_dump/pg_dump.c | 15 +++++++++++++--
src/bin/pg_dump/pg_dump.h | 1 +
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index c8d01ed..4c63694 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3911,6 +3911,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_tableoid;
int i_oid;
int i_pubname;
+ int i_pubrelqual;
int i,
j,
ntups;
@@ -3944,7 +3945,8 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
/* Get the publication membership for the table. */
appendPQExpBuffer(query,
- "SELECT pr.tableoid, pr.oid, p.pubname "
+ "SELECT pr.tableoid, pr.oid, p.pubname, "
+ "pg_catalog.pg_get_expr(pr.prrowfilter, pr.prrelid) AS pubrelqual "
"FROM pg_publication_rel pr, pg_publication p "
"WHERE pr.prrelid = '%u'"
" AND p.oid = pr.prpubid",
@@ -3965,6 +3967,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_pubname = PQfnumber(res, "pubname");
+ i_pubrelqual = PQfnumber(res, "pubrelqual");
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -3980,6 +3983,11 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].pubname = pg_strdup(PQgetvalue(res, j, i_pubname));
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, j, i_pubrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, j, i_pubrelqual));
+
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
}
@@ -4008,8 +4016,11 @@ dumpPublicationTable(Archive *fout, PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubrinfo->pubname));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE %s", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 685ad78..c2dfae6 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -609,6 +609,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
TableInfo *pubtable;
char *pubname;
+ char *pubrelqual;
} PublicationRelInfo;
/*
--
2.7.4
Attachment: 0008-Debug-for-row-filtering.patch (text/x-patch; charset=US-ASCII)
From f1571f8b607333efe3f4a70b5dc73dd069e89573 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Wed, 14 Mar 2018 00:53:17 +0000
Subject: [PATCH 8/8] Debug for row filtering
---
src/backend/commands/publicationcmds.c | 11 +++++
src/backend/replication/logical/tablesync.c | 1 +
src/backend/replication/pgoutput/pgoutput.c | 66 +++++++++++++++++++++++++++++
3 files changed, 78 insertions(+)
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index d4fca7f..27f1102 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -327,6 +327,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Oid pubid = HeapTupleGetOid(tup);
List *rels = NIL;
Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
+ ListCell *xpto;
/* Check that user is allowed to manipulate the publication tables. */
if (pubform->puballtables)
@@ -338,6 +339,16 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ foreach(xpto, stmt->tables)
+ {
+ PublicationTable *t = lfirst(xpto);
+
+ if (t->whereClause == NULL)
+ elog(DEBUG3, "publication \"%s\" has no WHERE clause", NameStr(pubform->pubname));
+ else
+ elog(DEBUG3, "publication \"%s\" has WHERE clause", NameStr(pubform->pubname));
+ }
+
/*
* ALTER PUBLICATION ... DROP TABLE cannot contain a WHERE clause. Use
* publication_table_list node (that accepts a WHERE clause) but forbid the
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index fa7c111..e0eb73c 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -882,6 +882,7 @@ copy_table(Relation rel)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
}
+ elog(DEBUG2, "COPY for initial synchronization: %s", cmd.data);
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 1f4a3d3..c7f0e32 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -31,6 +31,7 @@
#include "utils/builtins.h"
#include "utils/inval.h"
#include "utils/int8.h"
+#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -316,6 +317,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
MemoryContext old;
RelationSyncEntry *relentry;
+ Form_pg_class class_form;
+ char *schemaname;
+ char *tablename;
+
if (!is_publishable_relation(relation))
return;
@@ -340,6 +345,17 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Assert(false);
}
+ class_form = RelationGetForm(relation);
+ schemaname = get_namespace_name(class_form->relnamespace);
+ tablename = NameStr(class_form->relname);
+
+ if (change->action == REORDER_BUFFER_CHANGE_INSERT)
+ elog(DEBUG1, "INSERT \"%s\".\"%s\" txid: %u", schemaname, tablename, txn->xid);
+ else if (change->action == REORDER_BUFFER_CHANGE_UPDATE)
+ elog(DEBUG1, "UPDATE \"%s\".\"%s\" txid: %u", schemaname, tablename, txn->xid);
+ else if (change->action == REORDER_BUFFER_CHANGE_DELETE)
+ elog(DEBUG1, "DELETE \"%s\".\"%s\" txid: %u", schemaname, tablename, txn->xid);
+
/* ... then check row filter */
if (list_length(relentry->row_filter) > 0)
{
@@ -356,6 +372,42 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
tupdesc = RelationGetDescr(relation);
estate = create_estate_for_relation(relation);
+#ifdef _NOT_USED
+ if (old_tuple)
+ {
+ int i;
+
+ for (i = 0; i < tupdesc->natts; i++)
+ {
+ Form_pg_attribute attr;
+ HeapTuple type_tuple;
+ Oid typoutput;
+ bool typisvarlena;
+ bool isnull;
+ Datum val;
+ char *outputstr = NULL;
+
+ attr = TupleDescAttr(tupdesc, i);
+
+ /* Figure out type name */
+ type_tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(attr->atttypid));
+ if (HeapTupleIsValid(type_tuple))
+ {
+ /* Get information needed for printing values of a type */
+ getTypeOutputInfo(attr->atttypid, &typoutput, &typisvarlena);
+
+ val = heap_getattr(old_tuple, i + 1, tupdesc, &isnull);
+ if (!isnull)
+ {
+ outputstr = OidOutputFunctionCall(typoutput, val);
+ elog(DEBUG2, "row filter: REPLICA IDENTITY %s: %s", NameStr(attr->attname), outputstr);
+ pfree(outputstr);
+ }
+ }
+ }
+ }
+#endif
+
/* prepare context per tuple */
ecxt = GetPerTupleExprContext(estate);
oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
@@ -371,6 +423,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Oid expr_type;
Datum res;
bool isnull;
+ char *s = NULL;
row_filter = (Node *) lfirst(lc);
@@ -381,14 +434,24 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
expr_state = ExecInitExpr(expr, NULL);
res = ExecEvalExpr(expr_state, ecxt, &isnull);
+ elog(DEBUG3, "row filter: result: %s ; isnull: %s", (DatumGetBool(res)) ? "true" : "false", (isnull) ? "true" : "false");
+
/* if tuple does not match row filter, bail out */
if (!DatumGetBool(res) || isnull)
{
+ s = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, CStringGetTextDatum(nodeToString(row_filter)), ObjectIdGetDatum(relentry->relid)));
+ elog(DEBUG2, "row filter \"%s\" was not matched", s);
+ pfree(s);
+
MemoryContextSwitchTo(oldcxt);
ExecDropSingleTupleTableSlot(ecxt->ecxt_scantuple);
FreeExecutorState(estate);
return;
}
+
+ s = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, CStringGetTextDatum(nodeToString(row_filter)), ObjectIdGetDatum(relentry->relid)));
+ elog(DEBUG2, "row filter \"%s\" was matched", s);
+ pfree(s);
}
MemoryContextSwitchTo(oldcxt);
@@ -659,9 +722,12 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
MemoryContext oldctx = MemoryContextSwitchTo(CacheMemoryContext);
char *s = TextDatumGetCString(rf_datum);
+ char *t = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, rf_datum, ObjectIdGetDatum(entry->relid)));
Node *rf_node = stringToNode(s);
entry->row_filter = lappend(entry->row_filter, rf_node);
MemoryContextSwitchTo(oldctx);
+
+ elog(DEBUG2, "row filter \"%s\" found for publication \"%s\" and relation \"%s\"", t, pub->name, get_rel_name(relid));
}
ReleaseSysCache(rf_tuple);
--
2.7.4
On 2018-11-01 01:29, Euler Taveira wrote:
On Wed, Feb 28, 2018 at 20:03, Euler Taveira
<euler@timbira.com.br> wrote:
The attached patches add support for filtering rows in the publisher.
I ran pgbench-over-logical-replication with a WHERE-clause and could not
get this to do a correct replication. Below is the output of the
attached test program.
$ ./logrep_rowfilter.sh
--
/home/aardvark/pg_stuff/pg_installations/pgsql.logrep_rowfilter/bin.fast/initdb
--pgdata=/tmp/cascade/instance1/data --encoding=UTF8 --pwfile=/tmp/bugs
--
/home/aardvark/pg_stuff/pg_installations/pgsql.logrep_rowfilter/bin.fast/initdb
--pgdata=/tmp/cascade/instance2/data --encoding=UTF8 --pwfile=/tmp/bugs
--
/home/aardvark/pg_stuff/pg_installations/pgsql.logrep_rowfilter/bin.fast/initdb
--pgdata=/tmp/cascade/instance3/data --encoding=UTF8 --pwfile=/tmp/bugs
sleep 3s
dropping old tables...
creating tables...
generating data...
100000 of 100000 tuples (100%) done (elapsed 0.09 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done.
create publication pub_6515_to_6516;
alter publication pub_6515_to_6516 add table pgbench_accounts where (aid
between 40000 and 60000-1) ; --> where 1
alter publication pub_6515_to_6516 add table pgbench_branches;
alter publication pub_6515_to_6516 add table pgbench_tellers;
alter publication pub_6515_to_6516 add table pgbench_history;
create publication pub_6516_to_6517;
alter publication pub_6516_to_6517 add table pgbench_accounts ; -- where
(aid between 40000 and 60000-1) ; --> where 2
alter publication pub_6516_to_6517 add table pgbench_branches;
alter publication pub_6516_to_6517 add table pgbench_tellers;
alter publication pub_6516_to_6517 add table pgbench_history;
create subscription pub_6516_from_6515 connection 'port=6515
application_name=rowfilter'
publication pub_6515_to_6516 with(enabled=false);
alter subscription pub_6516_from_6515 enable;
create subscription pub_6517_from_6516 connection 'port=6516
application_name=rowfilter'
publication pub_6516_to_6517 with(enabled=false);
alter subscription pub_6517_from_6516 enable;
-- pgbench -p 6515 -c 16 -j 8 -T 5 -n postgres # scale 1
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 16
number of threads: 8
duration: 5 s
number of transactions actually processed: 80
latency average = 1178.106 ms
tps = 13.581120 (including connections establishing)
tps = 13.597443 (excluding connections establishing)
accounts branches tellers history
--------- --------- --------- ---------
6515 6546b1f0f 2d328ed28 7406473b0 7c1351523 e8c07347b
6516 6546b1f0f 2d328ed28 d41d8cd98 d41d8cd98 e7235f541
6517 f7c0791c8 d9c63e471 d41d8cd98 d41d8cd98 30892eea1 NOK
6515 6546b1f0f 2d328ed28 7406473b0 7c1351523 e8c07347b
6516 6546b1f0f 2d328ed28 7406473b0 5a54cf7c5 191ae1af3
6517 6546b1f0f 2d328ed28 7406473b0 5a54cf7c5 191ae1af3 NOK
6515 6546b1f0f 2d328ed28 7406473b0 7c1351523 e8c07347b
6516 6546b1f0f 2d328ed28 7406473b0 5a54cf7c5 191ae1af3
6517 6546b1f0f 2d328ed28 7406473b0 5a54cf7c5 191ae1af3 NOK
[...]
I let that run for 10 minutes or so, but the pgbench_history table
md5-values (of ports 6516 and 6517) do not change anymore, which shows
that the table is, and remains, different from the original
pgbench_history table in 6515.
When there is a where-clause this *always* goes wrong.
Without a where-clause all logical replication tests were OK. Perhaps
the error is not in our patch but something in logical replication.
Attached is the test program (will need some tweaking of PATHs,
PG-variables (PGPASSFILE) etc). This is the same program I used in
march when you first posted a version of this patch alhough the error is
different.
thanks,
Erik Rijkers
Attachments:
On 2018-11-01 08:56, Erik Rijkers wrote:
On 2018-11-01 01:29, Euler Taveira wrote:
On Wed, Feb 28, 2018 at 20:03, Euler Taveira
<euler@timbira.com.br> wrote:
The attached patches add support for filtering rows in the publisher.
I ran pgbench-over-logical-replication with a WHERE-clause and could
not get this to do a correct replication. Below is the output of the
attached test program.
$ ./logrep_rowfilter.sh
I have noticed that the failure to replicate correctly can be avoided by
putting a wait state of (on my machine) at least 3 seconds between the
setting up of the subscription and the start of pgbench. See the bash
program I attached in my previous mail. The bug can be avoided by a
'sleep 5' just before the start of the actual pgbench run.
So it seems this bug is due to some timing error in your patch (or
possibly in logical replication itself).
Erik Rijkers
On Thu, Nov 1, 2018 at 05:30, Erik Rijkers <er@xs4all.nl> wrote:
I ran pgbench-over-logical-replication with a WHERE-clause and could
not get this to do a correct replication. Below is the output of the
attached test program.
$ ./logrep_rowfilter.sh
Erik, thanks for testing.
So it seems this bug is due to some timing error in your patch (or
possibly in logical replication itself).
It is a bug in the new synchronization code. I'm doing some code
cleanup/review and will post a new patchset after I finish it. If you
want to give it a try again, apply the following patch.
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index e0eb73c..4797e0b 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -757,7 +757,7 @@ fetch_remote_table_info(char *nspname, char *relname,
 	/* Fetch row filtering info */
 	resetStringInfo(&cmd);
-	appendStringInfo(&cmd, "SELECT pg_get_expr(prrowfilter, prrelid) FROM pg_publication p INNER JOIN pg_publication_rel pr ON (p.oid = pr.prpubid) WHERE pr.prrelid = %u AND p.pubname IN (", MyLogicalRepWorker->relid);
+	appendStringInfo(&cmd, "SELECT pg_get_expr(prrowfilter, prrelid) FROM pg_publication p INNER JOIN pg_publication_rel pr ON (p.oid = pr.prpubid) WHERE pr.prrelid = %u AND p.pubname IN (", lrel->remoteid);
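The nature of the bug is easy to sketch: OIDs are assigned independently in each cluster, so the subscriber-side OID in MyLogicalRepWorker->relid is meaningless in a query shipped to the publisher. A minimal Python model of the mistake (the OID values and catalog dict are made up for illustration; this is not PostgreSQL code):

```python
# Hypothetical: the same table gets different OIDs on each cluster.
publisher_filters = {16403: "(aid BETWEEN 40000 AND 59999)"}  # keyed by publisher OID

local_oid = 24617    # what MyLogicalRepWorker->relid holds (subscriber side)
remote_oid = 16403   # what lrel->remoteid holds (publisher side)

# Looking up the publisher's catalog with the local OID finds no row-filter
# entry (or a wrong one), so the table sync runs with the wrong filter.
wrong = publisher_filters.get(local_oid)    # None
right = publisher_filters.get(remote_oid)
```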
--
Euler Taveira Timbira -
http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
On 2018-11-02 02:59, Euler Taveira wrote:
On Thu, Nov 1, 2018 at 05:30, Erik Rijkers <er@xs4all.nl> wrote:
I ran pgbench-over-logical-replication with a WHERE-clause and could
not get this to do a correct replication. Below is the output of the
attached test program.
$ ./logrep_rowfilter.sh
Erik, thanks for testing.
So it seems this bug is due to some timing error in your patch (or
possibly in logical replication itself).
It is a bug in the new synchronization code. I'm doing some code
cleanup/review and will post a new patchset after I finish it. If you
want to give it a try again, apply the following patch.
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index e0eb73c..4797e0b 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
[...]
That does indeed fix it.
Thank you,
Erik Rijkers
On 2018/11/01 0:29, Euler Taveira wrote:
On Wed, Feb 28, 2018 at 20:03, Euler Taveira
<euler@timbira.com.br> wrote:
The attached patches add support for filtering rows in the publisher.
I rebased the patch. I added row filtering for initial
synchronization, pg_dump support and psql support. 0001 removes unused
code. 0002 reduces memory use. 0003 passes only structure member that
is used in create_estate_for_relation. 0004 reuses a parser node for
row filtering. 0005 is the feature. 0006 prints WHERE expression in
psql. 0007 adds pg_dump support. 0008 is only for debug purposes (I'm
not sure some of these messages will be part of the final patch).
0001, 0002, 0003 and 0008 are not mandatory for this feature.
Comments?
Hi,
I reviewed your patches and I found a bug when I tested ALTER
PUBLICATION statement.
In short, ALTER PUBLICATION SET with a WHERE clause does not apply the
new WHERE clause.
I describe the outline of the test I did and my conclusion.
[TEST]
I show the test case I tried in below.
(1)Publisher and Subscriber
I executed each statement on the publisher and the subscriber.
```
testdb=# CREATE PUBLICATION pub_testdb_t FOR TABLE t WHERE (id > 10);
CREATE PUBLICATION
```
```
testdb=# CREATE SUBSCRIPTION sub_testdb_t CONNECTION 'dbname=testdb
port=5432 user=postgres' PUBLICATION pub_testdb_t;
NOTICE: created replication slot "sub_testdb_t" on publisher
CREATE SUBSCRIPTION
```
(2)Publisher
I executed the statements shown below.
testdb=# INSERT INTO t VALUES (1,1);
INSERT 0 1
testdb=# INSERT INTO t VALUES (11,11);
INSERT 0 1
(3)Subscriber
I confirmed that the CREATE PUBLICATION statement worked well.
```
testdb=# SELECT * FROM t;
id | data
----+------
11 | 11
(1 row)
```
(4)Publisher
After that, I executed ALTER PUBLICATION with a WHERE clause and
inserted a new row.
```
testdb=# ALTER PUBLICATION pub_testdb_t SET TABLE t WHERE (id > 5);
ALTER PUBLICATION
testdb=# INSERT INTO t VALUES (7,7);
INSERT 0 1
testdb=# SELECT * FROM t;
id | data
----+------
1 | 1
11 | 11
7 | 7
(3 rows)
```
(5)Subscriber
I confirmed that the change of WHERE clause set by ALTER PUBLICATION
statement was ignored.
```
testdb=# SELECT * FROM t;
id | data
----+------
11 | 11
(1 row)
```
[Conclusion]
I think AlterPublicationTables()@publicationcmds.c has a bug.
In the foreach(oldlc, oldrelids) loop, oldrel must be appended to
delrels if oldrel or newrel has a WHERE clause. However, the current
implementation does not, so the old WHERE clause is not deleted and
the new WHERE clause is ignored.
This is my speculation. It may not be correct, but, at least, it is a
fact that ALTER PUBLICATION with a WHERE clause does not function in my
environment with the operations described above.
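The speculated reconciliation logic can be sketched as follows (a hypothetical Python model, not the actual AlterPublicationTables() code): when computing which relations to drop, equality has to take the WHERE clause into account, not just the relation.

```python
def rels_to_drop(old, new):
    """old/new map relation name -> WHERE clause text (None means no filter).

    A relation must be dropped (and later re-added) when it left the
    publication OR when its filter changed; comparing relations alone
    silently keeps the old filter in place.
    """
    return [rel for rel, where in old.items()
            if rel not in new or new[rel] != where]

old = {"t": "(id > 10)"}
new = {"t": "(id > 5)"}   # ALTER PUBLICATION ... SET TABLE t WHERE (id > 5)
rels_to_drop(old, new)    # ['t'] -> re-adding picks up the new filter
```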
Best regards,
On 01/11/2018 01:29, Euler Taveira wrote:
On Wed, Feb 28, 2018 at 20:03, Euler Taveira
<euler@timbira.com.br> wrote:
The attached patches add support for filtering rows in the publisher.
I rebased the patch. I added row filtering for initial
synchronization, pg_dump support and psql support. 0001 removes unused
code. 0002 reduces memory use. 0003 passes only structure member that
is used in create_estate_for_relation. 0004 reuses a parser node for
row filtering. 0005 is the feature. 0006 prints WHERE expression in
psql. 0007 adds pg_dump support. 0008 is only for debug purposes (I'm
not sure some of these messages will be part of the final patch).
0001, 0002, 0003 and 0008 are not mandatory for this feature.
Comments?
Hi,
I think there are two main topics that still need to be discussed about
this patch.
Firstly, I am not sure if it's wise to allow UDFs in the filter clause
for the table. The reason for that is that we can't record all necessary
dependencies there because the functions are black box for parser. That
means if somebody drops object that an UDF used in replication filter
depends on, that function will start failing. But unlike for user
sessions it will start failing during decoding (well processing in
output plugin). And that's not recoverable by re-adding the missing
object; the only way to get out of that is either to move the slot
forward, which means losing part of the replication stream and needing a
manual resync, or a full rebuild of the replication setup. Neither of
which is good IMHO.
Secondly, do we want to at least notify user on filters (or maybe even
disallow them) with combination of action + column where column value
will not be logged? I mean for example we do this when processing the
filter against a row:
+ ExecStoreHeapTuple(new_tuple ? new_tuple : old_tuple, ecxt->ecxt_scantuple, false);
But if the user has an expression on a column which is not part of the
replica identity, that expression will always return NULL for DELETEs
because only the replica identity is logged with actual values and
everything else is NULL in old_tuple. So if the publication replicates
deletes we should check for this somehow.
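The effect can be sketched with SQL's three-valued logic (a toy Python model of the tuples involved, not the pgoutput code; column names are from the TAP test above): for a DELETE, WAL carries only the replica-identity columns, so a predicate on any other column evaluates to NULL and the row never matches.

```python
def matches(pred, row):
    # Mimics the patch's check: a NULL (None) result counts as "no match".
    return pred(row) is True

def pred(row):
    # Filter on non-replica-identity column "b"; NULL input -> NULL result
    # (SQL comparison semantics).
    return None if row["b"] is None else row["b"] != "filtered"

insert_tuple = {"a": 1980, "b": "not filtered"}  # full new tuple is logged
delete_tuple = {"a": 1980, "b": None}            # only replica identity "a" is logged

matches(pred, insert_tuple)   # the INSERT is published
matches(pred, delete_tuple)   # the matching DELETE is filtered out
```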
Btw about code (you already fixed the wrong reloid in sync so skipping
that).
0002:
 	for (tupn = 0; tupn < walres->ntuples; tupn++)
 	{
-		char	   *cstrs[MaxTupleAttributeNumber];
+		char	  **cstrs;

 		CHECK_FOR_INTERRUPTS();

 		/* Do the allocations in temporary context. */
 		oldcontext = MemoryContextSwitchTo(rowcontext);

+		cstrs = palloc(nfields * sizeof(char *));
Not really sure that this is actually worth it given that we have to
allocate and free this in a loop now while before it was just sitting on
a stack.
0005:
@@ -654,5 +740,10 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
 	 */
 	hash_seq_init(&status, RelationSyncCache);
 	while ((entry = (RelationSyncEntry *) hash_seq_search(&status)) != NULL)
+	{
 		entry->replicate_valid = false;
+		if (list_length(entry->row_filter) > 0)
+			list_free(entry->row_filter);
+		entry->row_filter = NIL;
+	}
Won't this leak memory? The list_free only frees the list cells, but not
the nodes you stored there before.
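The shallow-vs-deep distinction can be modelled in a few lines (a Python stand-in for list_free() vs list_free_deep() semantics; the Arena class is invented for illustration and the real fix would also free the expression nodes):

```python
class Arena:
    """Tracks live allocations, standing in for a memory context."""
    def __init__(self):
        self.live = {}
        self.counter = 0
    def alloc(self, what):
        self.counter += 1
        self.live[self.counter] = what
        return self.counter          # handle, like a pointer
    def free(self, handle):
        del self.live[handle]

arena = Arena()
nodes = [arena.alloc("row_filter_node") for _ in range(3)]  # parsed expressions
cells = [arena.alloc(("cell", n)) for n in nodes]           # list cells pointing at them

# Analogue of list_free(): releases the cells only; the nodes stay live.
for c in cells:
    arena.free(c)

leaked = [n for n in nodes if n in arena.live]
len(leaked)  # all three expression nodes are still allocated
```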
Also I think we should document here that the expression is run with the
session environment of the replication connection (so that it's more
obvious that things like CURRENT_USER will not return user which changed
tuple but the replication user).
It would be nice if 0006 had regression test and 0007 TAP test.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Greetings,
* Euler Taveira (euler@timbira.com.br) wrote:
2018-02-28 21:54 GMT-03:00 Craig Ringer <craig@2ndquadrant.com>:
Good idea. I haven't read this yet, but one thing to make sure you've
handled is limiting the clause to referencing only the current tuple and the
catalogs. User-catalog tables are OK, too, anything that is
RelationIsAccessibleInLogicalDecoding().
This means only immutable functions may be invoked, since a stable or
volatile function might attempt to access a table. And views must be
prohibited or recursively checked. (We have tree walkers that would help
with this.)
It might be worth looking at the current logic for CHECK expressions, since
the requirements are similar. In my opinion you could safely not bother with
allowing access to user catalog tables in the filter expressions and limit
them strictly to immutable functions and the tuple itself.
IIRC the implementation is similar to RLS expressions. I'll check all of
these rules.
Given the similarity to RLS and the nearby discussion about allowing
non-superusers to create subscriptions, and probably publications later,
I wonder if we shouldn't be somehow associating this with RLS policies
instead of having the publication filtering be entirely independent.
Thanks!
Stephen
On 23/11/2018 03:02, Stephen Frost wrote:
Greetings,
* Euler Taveira (euler@timbira.com.br) wrote:
2018-02-28 21:54 GMT-03:00 Craig Ringer <craig@2ndquadrant.com>:
Good idea. I haven't read this yet, but one thing to make sure you've
handled is limiting the clause to referencing only the current tuple and the
catalogs. User-catalog tables are OK, too, anything that is
RelationIsAccessibleInLogicalDecoding().
This means only immutable functions may be invoked, since a stable or
volatile function might attempt to access a table. And views must be
prohibited or recursively checked. (We have tree walkers that would help
with this.)
It might be worth looking at the current logic for CHECK expressions, since
the requirements are similar. In my opinion you could safely not bother with
allowing access to user catalog tables in the filter expressions and limit
them strictly to immutable functions and the tuple itself.
IIRC the implementation is similar to RLS expressions. I'll check all of
these rules.
Given the similarity to RLS and the nearby discussion about allowing
non-superusers to create subscriptions, and probably publications later,
I wonder if we shouldn't be somehow associating this with RLS policies
instead of having the publication filtering be entirely independent.
I do see the appeal here; if you consider logical replication to be a
streaming SELECT, it probably applies well.
But given that this is happening inside output plugin which does not
have full executor setup and has catalog-only snapshot I am not sure how
feasible it is to try to merge these two things. As per my previous
email it's possible that we'll have to be stricter about what we allow
in expressions here.
The other issue with merging this is that the use-case for filtering out
the data in logical replication is not necessarily about security, but
often about sending only relevant data. So it makes sense to have filter
on publication without RLS enabled on table and if we'd force that, we'd
limit usefulness of this feature.
We definitely want to eventually create subscriptions as non-superuser
but that has zero effect on this as everything here is happening on
different server than where subscription lives (we already allow
creation of publications with just CREATE privilege on database and
ownership of the table).
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Nov 23, 2018 at 11:40, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
But given that this is happening inside output plugin which does not
have full executor setup and has catalog-only snapshot I am not sure how
feasible it is to try to merge these two things. As per my previous
email it's possible that we'll have to be stricter about what we allow
in expressions here.
This feature should be as simple as possible. I don't want to
introduce a huge overhead just for filtering some data. Data sharding
generally uses simple expressions.
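The simple-expression sharding case can be illustrated with a toy model (a Python stand-in for a set of publications each carrying a filter like WHERE (c % 4 = k); the shard count and rows are made up):

```python
# Each publication filters on a cheap modulo predicate; together the
# shards form a disjoint and complete partition of the rows.
NSHARDS = 4
rows = list(range(1, 101))
shards = {k: [r for r in rows if r % NSHARDS == k] for k in range(NSHARDS)}

# Complete: every row lands in exactly one shard.
assert sum(len(s) for s in shards.values()) == len(rows)
# Disjoint: no row is published twice.
assert all(not (set(a) & set(b))
           for i, a in shards.items()
           for j, b in shards.items() if i != j)
```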
The other issue with merging this is that the use-case for filtering out
the data in logical replication is not necessarily about security, but
often about sending only relevant data. So it makes sense to have filter
on publication without RLS enabled on table and if we'd force that, we'd
limit usefulness of this feature.
Using the same infrastructure as RLS could be a good idea, but using
RLS itself for row filtering is not. RLS is complex.
--
Euler Taveira Timbira -
http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
On Thu, 22 Nov 2018 at 20:03, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
Firstly, I am not sure if it's wise to allow UDFs in the filter clause
for the table. The reason is that we can't record all the necessary
dependencies there, because the functions are a black box for the
parser. That means if somebody drops an object that a UDF used in a
replication filter depends on, that function will start failing. But
unlike for user sessions, it will start failing during decoding (well,
processing in the output plugin). And that's not recoverable by
recreating the missing object; the only way out is either to move the
slot forward, which means losing part of the replication stream and
needing a manual resync, or a full rebuild of replication. Neither of
which is good IMHO.
It is a foot gun, but there are several ways to do bad things in
Postgres. CREATE PUBLICATION is restricted to superusers and roles
with CREATE privilege in the current database. AFAICS a role with
CREATE privilege cannot drop objects it does not own. I wouldn't
like to disallow UDFs in row filtering expressions just because
someone doesn't set permissions correctly. Do you have any other
case in mind?
Secondly, do we want to at least notify the user on filters (or maybe
even disallow them) with a combination of action + column where the
column value will not be logged? I mean, for example, we do this when
processing the filter against a row:
+ ExecStoreHeapTuple(new_tuple ? new_tuple : old_tuple, ecxt->ecxt_scantuple, false);
We could emit a LOG message. That could possibly be an option but it
could be too complex for the first version.
But if the user has an expression on a column which is not part of
the replica identity, that expression will always return NULL for
DELETEs, because only the replica identity is logged with actual
values and everything else is NULL in old_tuple. So if the
publication replicates deletes, we should check for this somehow.
In this case, we should document this behavior. That is a recurring
question in wal2json issues. Besides that, we should explain that
UPDATE/DELETE tuples don't log all columns (people think the behavior
is equivalent to triggers; it is not, unless you set REPLICA IDENTITY
FULL).
Not really sure that this is actually worth it given that we have to
allocate and free this in a loop now while before it was just sitting on
a stack.
That is experimental code that should be in a separate patch. Don't
you think low memory use is a good goal? I also think that
MaxTupleAttributeNumber is an extreme value. I did some preliminary
tests and didn't notice overhead. I'll leave these modifications in
a separate patch.
Won't this leak memory? The list_free only frees the list cells, but not
the nodes you stored there before.
Good catch. It should be list_free_deep.
Also I think we should document here that the expression is run with the
session environment of the replication connection (so that it's more
obvious that things like CURRENT_USER will not return user which changed
tuple but the replication user).
Sure.
It would be nice if 0006 had regression test and 0007 TAP test.
Sure.
Besides the problem presented by Hironobu-san, I'm doing some cleanup
and improving docs. I also forget to declare pg_publication_rel TOAST
table.
Thanks for your review.
--
Euler Taveira Timbira -
http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
On 2018-Nov-23, Euler Taveira wrote:
On Thu, 22 Nov 2018 at 20:03, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
Won't this leak memory? The list_free only frees the list cells, but not
the nodes you stored there before.
Good catch. It should be list_free_deep.
Actually, if the nodes have more structure (say you palloc one list
item, but that list item also contains pointers to a Node) then a
list_free_deep won't be enough either. I'd suggest to create a bespoke
memory context, which you can delete afterwards.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Nov 23, 2018 at 12:03:27AM +0100, Petr Jelinek wrote:
On 01/11/2018 01:29, Euler Taveira wrote:
On Wed, 28 Feb 2018 at 20:03, Euler Taveira
<euler@timbira.com.br> wrote:
The attached patches add support for filtering rows in the publisher.
I rebased the patch. I added row filtering for initial
synchronization, pg_dump support and psql support. 0001 removes unused
code. 0002 reduces memory use. 0003 passes only structure member that
is used in create_estate_for_relation. 0004 reuses a parser node for
row filtering. 0005 is the feature. 0006 prints WHERE expression in
psql. 0007 adds pg_dump support. 0008 is only for debug purposes (I'm
not sure some of these messages will be part of the final patch).
0001, 0002, 0003 and 0008 are not mandatory for this feature.
Hi,
I think there are two main topics that still need to be discussed about
this patch.
Firstly, I am not sure if it's wise to allow UDFs in the filter clause
for the table. The reason for that is that we can't record all necessary
dependencies there because the functions are black box for parser.
Some UDFs are not a black box for the parser, namely ones written in
SQL. Would it make sense at least not to foreclose the non-(black box)
option?
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On 23/11/2018 17:39, David Fetter wrote:
On Fri, Nov 23, 2018 at 12:03:27AM +0100, Petr Jelinek wrote:
On 01/11/2018 01:29, Euler Taveira wrote:
On Wed, 28 Feb 2018 at 20:03, Euler Taveira
<euler@timbira.com.br> wrote:
The attached patches add support for filtering rows in the publisher.
I rebased the patch. I added row filtering for initial
synchronization, pg_dump support and psql support. 0001 removes unused
code. 0002 reduces memory use. 0003 passes only structure member that
is used in create_estate_for_relation. 0004 reuses a parser node for
row filtering. 0005 is the feature. 0006 prints WHERE expression in
psql. 0007 adds pg_dump support. 0008 is only for debug purposes (I'm
not sure some of these messages will be part of the final patch).
0001, 0002, 0003 and 0008 are not mandatory for this feature.
Hi,
I think there are two main topics that still need to be discussed about
this patch.
Firstly, I am not sure if it's wise to allow UDFs in the filter clause
for the table. The reason for that is that we can't record all necessary
dependencies there because the functions are black box for parser.
Some UDFs are not a black box for the parser, namely ones written in
SQL. Would it make sense at least not to foreclose the non-(black box)
option?
Yeah inlinable SQL functions should be fine, we just need the ability to
extract dependencies.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 23/11/2018 17:15, Euler Taveira wrote:
On Thu, 22 Nov 2018 at 20:03, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
Firstly, I am not sure if it's wise to allow UDFs in the filter clause
for the table. The reason for that is that we can't record all necessary
dependencies there because the functions are black box for parser. That
means if somebody drops object that an UDF used in replication filter
depends on, that function will start failing. But unlike for user
sessions it will start failing during decoding (well processing in
output plugin). And that's not recoverable by reading the missing
object, the only way to get out of that is either to move slot forward
which means losing part of replication stream and need for manual resync
or full rebuild of replication. Neither of which are good IMHO.
It is a foot gun but there are several ways to do bad things in
postgres. CREATE PUBLICATION is restricted to superusers and role with
CREATE privilege in current database. AFAICS a role with CREATE
privilege cannot drop objects whose owner is not himself. I wouldn't
like to disallow UDFs in row filtering expressions just because
someone doesn't set permissions correctly. Do you have any other case
in mind?
I don't think this has anything to do with security. Stupid example:
user1: CREATE EXTENSION citext;
user2: CREATE FUNCTION myfilter(col1 text, col2 text) returns boolean
language plpgsql as
$$BEGIN
RETURN col1::citext = col2::citext;
END;$$
user2: ALTER PUBLICATION mypub ADD TABLE mytab WHERE (myfilter(a,b));
[... replication happening ...]
user1: DROP EXTENSION citext;
And now replication is broken and unrecoverable without data loss.
Recreating extension will not help because the changes happening in
meantime will not see it in the historical snapshot.
I don't think it's okay to do completely nothing about this.
Secondly, do we want to at least notify user on filters (or maybe even
disallow them) with combination of action + column where column value
will not be logged? I mean for example we do this when processing the
filter against a row:
+ ExecStoreHeapTuple(new_tuple ? new_tuple : old_tuple, ecxt->ecxt_scantuple, false);
We could emit a LOG message. That could possibly be an option but it
could be too complex for the first version.
Well, it needs a walker which extracts Vars from the expression and
checks them against the replica identity columns. We already have a
way to fetch the replica identity columns, and the walker could be
something like a simplified version of the find_expr_references_walker
used by recordDependencyOnSingleRelExpr (I don't think there is
anything ready-made already).
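The shape of such a walker can be sketched like this (a hypothetical Python model; the tuple encoding of expression nodes and the `filter_safe_for_deletes` helper are invented for illustration, the real code would walk PostgreSQL Node trees): recurse over the parsed filter expression, collect every column (Var) it references, and check that all of them are replica identity columns.

```python
# Hypothetical sketch of the proposed walker: collect column
# references from a filter expression and verify they are a subset
# of the replica identity columns.

def collect_columns(node, found):
    kind = node[0]
    if kind == "var":            # ("var", column_name)
        found.add(node[1])
    elif kind == "op":           # ("op", operator, left, right)
        collect_columns(node[2], found)
        collect_columns(node[3], found)
    # ("const", value) nodes reference no columns
    return found

def filter_safe_for_deletes(expr, replica_identity):
    """True when the filter only touches replica identity columns."""
    return collect_columns(expr, set()) <= replica_identity

# WHERE id > 2000 AND price < 10
expr = ("op", "and",
        ("op", ">", ("var", "id"), ("const", 2000)),
        ("op", "<", ("var", "price"), ("const", 10)))

print(filter_safe_for_deletes(expr, {"id"}))           # False: price not in identity
print(filter_safe_for_deletes(expr, {"id", "price"}))  # True
```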
But if user has expression on column which is not part of replica
identity that expression will always return NULL for DELETEs because
only replica identity is logged with actual values and everything else
is NULL in old_tuple. So if publication replicates deletes we should
check for this somehow.
In this case, we should document this behavior. That is a recurring
question in wal2json issues. Besides that we should explain that
UPDATE/DELETE tuples doesn't log all columns (people think the
behavior is equivalent to triggers; it is not unless you set REPLICA
IDENTITY FULL).
Not really sure that this is actually worth it given that we have to
allocate and free this in a loop now while before it was just sitting on
a stack.
That is experimental code that should be in a separate patch.
Don't you think low memory use is a good goal? I also think that
MaxTupleAttributeNumber is an extreme value. I did some preliminary
tests and didn't notice overhead. I'll leave these modifications in a
separate patch.
It's static memory and it's a few KB of it (it's just single array of
pointers, not array of data, and does not depend on the number of rows).
Palloc will definitely need more CPU cycles.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Nov 23, 2018 at 3:55 PM Petr Jelinek <petr.jelinek@2ndquadrant.com>
wrote:
On 23/11/2018 17:15, Euler Taveira wrote:
On Thu, 22 Nov 2018 at 20:03, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
Firstly, I am not sure if it's wise to allow UDFs in the filter clause
for the table. The reason for that is that we can't record all
necessary
dependencies there because the functions are black box for parser. That
means if somebody drops object that an UDF used in replication filter
depends on, that function will start failing. But unlike for user
sessions it will start failing during decoding (well processing in
output plugin). And that's not recoverable by reading the missing
object, the only way to get out of that is either to move slot forward
which means losing part of replication stream and need for manual
resync
or full rebuild of replication. Neither of which are good IMHO.
It is a foot gun but there are several ways to do bad things in
postgres. CREATE PUBLICATION is restricted to superusers and role with
CREATE privilege in current database. AFAICS a role with CREATE
privilege cannot drop objects whose owner is not himself. I wouldn't
like to disallow UDFs in row filtering expressions just because
someone doesn't set permissions correctly. Do you have any other case
in mind?
I don't think this has anything to do with security. Stupid example:
user1: CREATE EXTENSION citext;
user2: CREATE FUNCTION myfilter(col1 text, col2 text) returns boolean
language plpgsql as
$$BEGIN
RETURN col1::citext = col2::citext;
END;$$
user2: ALTER PUBLICATION mypub ADD TABLE mytab WHERE (myfilter(a,b));
[... replication happening ...]
user1: DROP EXTENSION citext;
And now replication is broken and unrecoverable without data loss.
Recreating extension will not help because the changes happening in
meantime will not see it in the historical snapshot.
I don't think it's okay to do completely nothing about this.
If carefully documented I see no problem with it... we already have an
analogous problem with functional indexes.
Regards,
--
Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
On 23/11/2018 19:05, Fabrízio de Royes Mello wrote:
On Fri, Nov 23, 2018 at 3:55 PM Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
On 23/11/2018 17:15, Euler Taveira wrote:
On Thu, 22 Nov 2018 at 20:03, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
Firstly, I am not sure if it's wise to allow UDFs in the filter clause
for the table. The reason for that is that we can't record all necessary
dependencies there because the functions are black box for parser. That
means if somebody drops object that an UDF used in replication filter
depends on, that function will start failing. But unlike for user
sessions it will start failing during decoding (well processing in
output plugin). And that's not recoverable by reading the missing
object, the only way to get out of that is either to move slot forward
which means losing part of replication stream and need for manual resync
or full rebuild of replication. Neither of which are good IMHO.
It is a foot gun but there are several ways to do bad things in
postgres. CREATE PUBLICATION is restricted to superusers and role with
CREATE privilege in current database. AFAICS a role with CREATE
privilege cannot drop objects whose owner is not himself. I wouldn't
like to disallow UDFs in row filtering expressions just because
someone doesn't set permissions correctly. Do you have any other case
in mind?
I don't think this has anything to do with security. Stupid example:
user1: CREATE EXTENSION citext;
user2: CREATE FUNCTION myfilter(col1 text, col2 text) returns boolean
language plpgsql as
$$BEGIN
RETURN col1::citext = col2::citext;
END;$$
user2: ALTER PUBLICATION mypub ADD TABLE mytab WHERE (myfilter(a,b));
[... replication happening ...]
user1: DROP EXTENSION citext;
And now replication is broken and unrecoverable without data loss.
Recreating extension will not help because the changes happening in
meantime will not see it in the historical snapshot.
I don't think it's okay to do completely nothing about this.
If carefully documented I see no problem with it... we already have an
analogous problem with functional indexes.
The difference is that with functional indexes you can recreate the
missing object and everything is okay again. With logical replication
recreating the object will not help.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Nov 23, 2018 at 4:13 PM Petr Jelinek <petr.jelinek@2ndquadrant.com>
wrote:
If carefully documented I see no problem with it... we already have an
analogous problem with functional indexes.
The difference is that with functional indexes you can recreate the
missing object and everything is okay again. With logical replication
recreating the object will not help.
In this case with logical replication you should rsync the object. That is
the price of misunderstanding / bad use of the new feature.
As usual, there is no free beer ;-)
Regards,
--
Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Greetings,
* Fabrízio de Royes Mello (fabriziomello@gmail.com) wrote:
On Fri, Nov 23, 2018 at 4:13 PM Petr Jelinek <petr.jelinek@2ndquadrant.com>
wrote:
If carefully documented I see no problem with it... we already have an
analogous problem with functional indexes.
The difference is that with functional indexes you can recreate the
missing object and everything is okay again. With logical replication
recreating the object will not help.
In this case with logical replication you should rsync the object. That is
the price of misunderstanding / bad use of the new feature.
As usual, there is no free beer ;-)
There's also certainly no shortage of other ways to break logical
replication, including ways that would also be hard to recover from
today other than doing a full resync.
What that seems to indicate, to me at least, is that it'd be awfully
nice to have a way to resync the data which doesn't necessarily
involve transferring all of it over again.
Of course, it'd be nice if we could track those dependencies too,
but that's yet another thing.
In short, I'm not sure that I agree with the idea that we shouldn't
allow this and instead I'd rather we realize it and put the logical
replication into some kind of an error state that requires a resync.
Thanks!
Stephen
On 23/11/2018 19:29, Fabrízio de Royes Mello wrote:
On Fri, Nov 23, 2018 at 4:13 PM Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
If carefully documented I see no problem with it... we already have an
analogous problem with functional indexes.
The difference is that with functional indexes you can recreate the
missing object and everything is okay again. With logical replication
recreating the object will not help.
In this case with logical replication you should rsync the object. That
is the price of misunderstanding / bad use of the new feature.
As usual, there is no free beer ;-)
Yeah but you have to resync whole subscription, not just single table
(removing table from the publication will also not help), that's pretty
severe punishment. What if you have triggers downstream that do
calculations or logging which you can't recover by simply rebuilding
replica? I think it's better to err on the side of no data loss.
We could also try to figure out a way to recover from this that does not
require resync, ie perhaps we could somehow temporarily force evaluation
of the expression to have current snapshot.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 11/23/18 8:03 PM, Stephen Frost wrote:
Greetings,
* Fabrízio de Royes Mello (fabriziomello@gmail.com) wrote:
On Fri, Nov 23, 2018 at 4:13 PM Petr Jelinek <petr.jelinek@2ndquadrant.com>
wrote:
If carefully documented I see no problem with it... we already have an
analogous problem with functional indexes.
The difference is that with functional indexes you can recreate the
missing object and everything is okay again. With logical replication
recreating the object will not help.
In this case with logical replication you should rsync the object. That is
the price of misunderstanding / bad use of the new feature.
As usual, there is no free beer ;-)
There's also certainly no shortage of other ways to break logical
replication, including ways that would also be hard to recover from
today other than doing a full resync.
Sure, but that seems more like an argument against creating additional
ones (and for preventing those that already exist). I'm not sure this
particular feature is where we should draw the line, though.
What that seems to indicate, to me at least, is that it'd be awfully
nice to have a way to resync the data which doesn't necessarily
involve transferring all of it over again.
Of course, it'd be nice if we could track those dependencies too,
but that's yet another thing.
Yep, that seems like a good idea in general. Both here and for
functional indexes (although I suppose there is a technical reason why it
wasn't implemented right away for them).
In short, I'm not sure that I agree with the idea that we shouldn't
allow this and instead I'd rather we realize it and put the logical
replication into some kind of an error state that requires a resync.
That would still mean a need to resync the data to recover, so I'm not
sure it's really an improvement. And I suppose it'd require tracking the
dependencies, because how else would you mark the subscription as
requiring a resync? At which point we could decline the DROP without a
CASCADE, just like we do elsewhere, no?
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 11/23/18 8:14 PM, Petr Jelinek wrote:
On 23/11/2018 19:29, Fabrízio de Royes Mello wrote:
On Fri, Nov 23, 2018 at 4:13 PM Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
If carefully documented I see no problem with it... we already have an
analogous problem with functional indexes.
The difference is that with functional indexes you can recreate the
missing object and everything is okay again. With logical replication
recreating the object will not help.
In this case with logical replication you should rsync the object. That
is the price of misunderstanding / bad use of the new feature.
As usual, there is no free beer ;-)
Yeah but you have to resync whole subscription, not just single table
(removing table from the publication will also not help), that's pretty
severe punishment. What if you have triggers downstream that do
calculations or logging which you can't recover by simply rebuilding
replica? I think it's better to err on the side of no data loss.
Yeah, having to resync everything because you accidentally dropped a
function is quite annoying. Of course, you should notice that while
testing the upgrade in a testing environment, but still ...
We could also try to figure out a way to recover from this that does not
require resync, ie perhaps we could somehow temporarily force evaluation
of the expression to have current snapshot.
That seems like a huge can of worms ...
cheers
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Greetings,
* Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:
On 11/23/18 8:03 PM, Stephen Frost wrote:
* Fabrízio de Royes Mello (fabriziomello@gmail.com) wrote:
On Fri, Nov 23, 2018 at 4:13 PM Petr Jelinek <petr.jelinek@2ndquadrant.com>
wrote:
If carefully documented I see no problem with it... we already have an
analogous problem with functional indexes.
The difference is that with functional indexes you can recreate the
missing object and everything is okay again. With logical replication
recreating the object will not help.
In this case with logical replication you should rsync the object. That is
the price of misunderstanding / bad use of the new feature.
As usual, there is no free beer ;-)
There's also certainly no shortage of other ways to break logical
replication, including ways that would also be hard to recover from
today other than doing a full resync.
Sure, but that seems more like an argument against creating additional
ones (and for preventing those that already exist). I'm not sure this
particular feature is where we should draw the line, though.
I was actually going in the other direction- we should allow it because
advanced users may know what they're doing better than we do and we
shouldn't prevent things just because they might be misused or
misunderstood by a user.
What that seems to indicate, to me at least, is that it'd be awfully
nice to have a way to resync the data which doesn't necessarily
involve transferring all of it over again.
Of course, it'd be nice if we could track those dependencies too,
but that's yet another thing.
Yep, that seems like a good idea in general. Both here and for
functional indexes (although I suppose there is a technical reason why it
wasn't implemented right away for them).
We don't track function dependencies in general and I could certainly
see cases where you really wouldn't want to do so, at least not in the
same way that we track FKs or similar. I do wonder if maybe we didn't
track function dependencies because we didn't (yet) have create or
replace function and that now we should. We don't track dependencies
inside a function either though.
In short, I'm not sure that I agree with the idea that we shouldn't
allow this and instead I'd rather we realize it and put the logical
replication into some kind of an error state that requires a resync.
That would still mean a need to resync the data to recover, so I'm not
sure it's really an improvement. And I suppose it'd require tracking the
dependencies, because how else would you mark the subscription as
requiring a resync? At which point we could decline the DROP without a
CASCADE, just like we do elsewhere, no?
I was actually thinking more along the lines of just simply marking the
publication/subscription as being in a 'failed' state when a failure
actually happens, and maybe even at that point basically throwing away
everything except the shell of the publication/subscription (so the user
can see that it failed and come in and properly drop it); I'm thinking
about this as perhaps similar to a transaction being aborted.
Thanks!
Stephen
On 14/12/2018 16:56, Stephen Frost wrote:
Greetings,
* Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:
On 11/23/18 8:03 PM, Stephen Frost wrote:
* Fabrízio de Royes Mello (fabriziomello@gmail.com) wrote:
On Fri, Nov 23, 2018 at 4:13 PM Petr Jelinek <petr.jelinek@2ndquadrant.com>
wrote:
If carefully documented I see no problem with it... we already have an
analogous problem with functional indexes.
The difference is that with functional indexes you can recreate the
missing object and everything is okay again. With logical replication
recreating the object will not help.
In this case with logical replication you should rsync the object. That is
the price of misunderstanding / bad use of the new feature.
As usual, there is no free beer ;-)
There's also certainly no shortage of other ways to break logical
replication, including ways that would also be hard to recover from
today other than doing a full resync.
Sure, but that seems more like an argument against creating additional
ones (and for preventing those that already exist). I'm not sure this
particular feature is where we should draw the line, though.
I was actually going in the other direction- we should allow it because
advanced users may know what they're doing better than we do and we
shouldn't prevent things just because they might be misused or
misunderstood by a user.
That's all good, but we need a good escape hatch for when things go
south, and we don't have one, and IMHO it's not as easy to have one as
you might think.
That's why I would do the simple and safe way first before allowing
more, otherwise we'll be discussing this for next couple of PG versions.
What that seems to indicate, to me at least, is that it'd be awfully
nice to have a way to resync the data which doesn't necessarily
involve transferring all of it over again.
Of course, it'd be nice if we could track those dependencies too,
but that's yet another thing.
Yep, that seems like a good idea in general. Both here and for
functional indexes (although I suppose there is a technical reason why it
wasn't implemented right away for them).
We don't track function dependencies in general and I could certainly
see cases where you really wouldn't want to do so, at least not in the
same way that we track FKs or similar. I do wonder if maybe we didn't
track function dependencies because we didn't (yet) have create or
replace function and that now we should. We don't track dependencies
inside a function either though.
Yeah we can't always have dependencies, it would break some perfectly
valid usage scenarios. Also it's not exactly clear to me how we'd track
dependencies of say plpython function...
In short, I'm not sure that I agree with the idea that we shouldn't
allow this and instead I'd rather we realize it and put the logical
replication into some kind of an error state that requires a resync.
That would still mean a need to resync the data to recover, so I'm not
sure it's really an improvement. And I suppose it'd require tracking the
dependencies, because how else would you mark the subscription as
requiring a resync? At which point we could decline the DROP without a
CASCADE, just like we do elsewhere, no?
I was actually thinking more along the lines of just simply marking the
publication/subscription as being in a 'failed' state when a failure
actually happens, and maybe even at that point basically throwing away
everything except the shell of the publication/subscription (so the user
can see that it failed and come in and properly drop it); I'm thinking
about this as perhaps similar to a transaction being aborted.
There are several problems with that. First, this happens in a
historic snapshot, which can't write, and on top of that we are in the
middle of error processing, so we have our hands tied a bit; it's
definitely going to need a bit of creative thinking to do this.
Second, and that's a softer issue (which is probably harder to solve):
what do we do with the slot and subscription? There is one failed
publication, but the subscription may be subscribed to 20 of them; do we
kill the whole subscription because of a single failed publication? If we
don't, do we continue replicating like nothing has happened but with data
in the failed publication missing (which can be considered data
loss/corruption from the view of the user)? If we stop replication, do we
clean the slot so that we don't keep back wal/catalog xmin forever
(which could lead to the server stopping), or do we keep the slot so that
the user can somehow fix the issue (reconfigure the subscription to not
care about that publication, for example) and continue replication
without further loss?
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Greetings,
* Petr Jelinek (petr.jelinek@2ndquadrant.com) wrote:
On 14/12/2018 16:56, Stephen Frost wrote:
* Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:
On 11/23/18 8:03 PM, Stephen Frost wrote:
* Fabrízio de Royes Mello (fabriziomello@gmail.com) wrote:
On Fri, Nov 23, 2018 at 4:13 PM Petr Jelinek <petr.jelinek@2ndquadrant.com>
wrote:
If carefully documented I see no problem with it... we already have an
analogous problem with functional indexes.
The difference is that with functional indexes you can recreate the
missing object and everything is okay again. With logical replication
recreating the object will not help.
In this case with logical replication you should rsync the object. That is
the price of misunderstanding / bad use of the new feature.
As usual, there is no free beer ;-)
There's also certainly no shortage of other ways to break logical
replication, including ways that would also be hard to recover from
today other than doing a full resync.
Sure, but that seems more like an argument against creating additional
ones (and for preventing those that already exist). I'm not sure this
particular feature is where we should draw the line, though.
I was actually going in the other direction- we should allow it because
advanced users may know what they're doing better than we do and we
shouldn't prevent things just because they might be misused or
misunderstood by a user.
That's all good, but we need good escape hatch for when things go south
and we don't have it and IMHO it's not as easy to have one as you might
think.
We don't have a great solution but we should be able to at least drop
and recreate the publication or subscription, even today, can't we?
Sure, that means having to recopy everything, but that's what you get if
you break your publication/subscription. If we allow the user to get to
a point where the system can't be fixed then I agree that's a serious
issue, but hopefully that isn't the case.
What that seems to indicate, to me at least, is that it'd be awful
nice to have a way to resync the data which doesn't necessarily
involve transferring all of it over again.

Of course, it'd be nice if we could track those dependencies too,
but that's yet another thing.

Yep, that seems like a good idea in general. Both here and for
functional indexes (although I suppose there is a technical reason why it
wasn't implemented right away for them).

We don't track function dependencies in general and I could certainly
see cases where you really wouldn't want to do so, at least not in the
same way that we track FKs or similar. I do wonder if maybe we didn't
track function dependencies because we didn't (yet) have create or
replace function and that now we should. We don't track dependencies
inside a function either though.

Yeah, we can't always have dependencies; it would break some perfectly
valid usage scenarios. Also it's not exactly clear to me how we'd track
dependencies of, say, a plpython function...
Well, we could at least depend on the functions explicitly listed at
the top level, and I don't believe we even do that today. I can't think
of any downside off-hand to that, given that we have create-or-replace
function.
In short, I'm not sure that I agree with the idea that we shouldn't
allow this; instead I'd rather we recognize it and put the logical
replication into some kind of an error state that requires a resync.

That would still mean a need to resync the data to recover, so I'm not
sure it's really an improvement. And I suppose it'd require tracking the
dependencies, because how else would you mark the subscription as
requiring a resync? At which point we could decline the DROP without a
CASCADE, just like we do elsewhere, no?

I was actually thinking more along the lines of just simply marking the
publication/subscription as being in a 'failed' state when a failure
actually happens, and maybe even at that point basically throwing away
everything except the shell of the publication/subscription (so the user
can see that it failed and come in and properly drop it); I'm thinking
about this as perhaps similar to a transaction being aborted.

There are several problems with that. First, this happens in a historic
snapshot which can't write, and on top of that we are in the middle of
error processing, so we have our hands tied a bit; it's definitely going
to need a bit of creative thinking to do this.
We can't write to things inside the database in a historic snapshot and
we do have to deal with the fact that we're in error processing. What
about writing somewhere that's outside of the regular database system?
Maybe a pg_logical/failed directory? There's all the usual
complications from that around dealing with durable writes (if we need
to worry about that and I'm not sure that we do... if we fail to
persist a write saying "X failed" and we restart.. well, it's gonna fail
again and we write it then), and cleaning things up as needed (but maybe
this is handled as part of the DROP, and we WAL that, so we can re-do
the removal of the failed marker file...), and if we need to think about
what should happen on replicas (is there anything?).
Second, and that's more of a soft issue (which is probably harder to solve):
what do we do with the slot and subscription? There is one failed
publication, but the subscription may be subscribed to 20 of them; do we
kill the whole subscription because of a single failed publication? If we
don't, do we continue replicating like nothing has happened, but with data
in the failed publication missing (which can be considered data
loss/corruption from the user's point of view)? If we stop replication, do we
clean the slot so that we don't keep back WAL/catalog xmin forever
(which could lead to the server stopping), or do we keep the slot so that
the user can somehow fix the issue (reconfigure the subscription to not care
about that publication, for example) and continue replication without
further loss?
I would think we'd have to fail the whole publication if there's a
failure for any part of it. Replicating a partial set definitely sounds
wrong to me. Once we stop replication, yes, we should clean the slot
and mark it failed so that we don't keep back WAL and so that we allow
the catalog xmin to move forward so that the failed publication doesn't
run the server out of disk space.
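As background for the WAL/catalog-xmin concern above: the state a slot is holding back can be inspected from the standard pg_replication_slots view, for example:

```sql
-- Per slot, how far back WAL (restart_lsn) and system catalogs
-- (catalog_xmin) are being retained on the publisher.
SELECT slot_name, slot_type, active, restart_lsn, catalog_xmin
FROM pg_replication_slots;
```

An abandoned logical slot shows active = false while restart_lsn and catalog_xmin stop advancing, which is exactly the "keep back wal/catalog xmin forever" situation being discussed.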
If we really think there's a use-case for keeping the replication slot
and allowing it to cause WAL to spool on the server and keep the catalog
xmin back then I'd suggest we make this behavior configurable- so that
users can choose on a publication if they want a failure to be
considered a 'soft' fail or a 'hard' fail. A 'soft' fail would keep the
slot and keep the WAL and keep the catalog xmin, with the expectation
that the user will either drop the slot themselves or somehow fix it,
while a 'hard' fail would clean everything up except the skeleton of the
slot itself which the user would need to drop.
Thanks!
Stephen
On 27/12/2018 20:19, Stephen Frost wrote:
Greetings,
* Petr Jelinek (petr.jelinek@2ndquadrant.com) wrote:
On 14/12/2018 16:56, Stephen Frost wrote:
* Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:
On 11/23/18 8:03 PM, Stephen Frost wrote:
* Fabrízio de Royes Mello (fabriziomello@gmail.com) wrote:
On Fri, Nov 23, 2018 at 4:13 PM Petr Jelinek <petr.jelinek@2ndquadrant.com>
wrote:

If carefully documented I see no problem with it... we already have an
analogous problem with functional indexes.

The difference is that with functional indexes you can recreate the
missing object and everything is okay again. With logical replication
recreating the object will not help.

In this case with logical replication you should rsync the object. That is
the price of misunderstanding / bad use of the new feature. As usual, there
is no free beer ;-)

There's also certainly no shortage of other ways to break logical
replication, including ways that would also be hard to recover from
today other than doing a full resync.

Sure, but that seems more like an argument against creating additional
ones (and for preventing those that already exist). I'm not sure this
particular feature is where we should draw the line, though.

I was actually going in the other direction- we should allow it because
advanced users may know what they're doing better than we do and we
shouldn't prevent things just because they might be misused or
misunderstood by a user.

That's all good, but we need a good escape hatch for when things go south,
and we don't have it, and IMHO it's not as easy to have one as you might
think.

We don't have a great solution but we should be able to at least drop
and recreate the publication or subscription, even today, can't we?
Well, we can always drop things, yes; not having the ability to drop things
when they break would be bad design. I am debating the ability to recover
without rebuilding everything, as there are cases where you simply can't
rebuild everything (i.e. we allow filtering out deletes). I don't like
disabling UDFs either, as that means that user-created types are unusable
in filters; I just wonder if saying "sorry, your replica is gone" is any
better.
Sure, that means having to recopy everything, but that's what you get if
you break your publication/subscription.
This is a bit off-topic here, but I really wonder how you are currently
breaking your publications/subscriptions.
What that seems to indicate, to me at least, is that it'd be awful
nice to have a way to resync the data which doesn't necessarily
involve transferring all of it over again.

Of course, it'd be nice if we could track those dependencies too,
but that's yet another thing.

Yep, that seems like a good idea in general. Both here and for
functional indexes (although I suppose there is a technical reason why it
wasn't implemented right away for them).

We don't track function dependencies in general and I could certainly
see cases where you really wouldn't want to do so, at least not in the
same way that we track FKs or similar. I do wonder if maybe we didn't
track function dependencies because we didn't (yet) have create or
replace function and that now we should. We don't track dependencies
inside a function either though.

Yeah, we can't always have dependencies; it would break some perfectly
valid usage scenarios. Also it's not exactly clear to me how we'd track
dependencies of, say, a plpython function...

Well, we could at least depend on the functions explicitly listed at
the top level, and I don't believe we even do that today. I can't think
of any downside off-hand to that, given that we have create-or-replace
function.

I dunno how much that is worth TBH; the situations where I've seen
this issue (pglogical has had this feature for a long time and suffers from
the same lack of dependency tracking) are that somebody drops a table/type
used in a function that is used as a filter.
In short, I'm not sure that I agree with the idea that we shouldn't
allow this; instead I'd rather we recognize it and put the logical
replication into some kind of an error state that requires a resync.

That would still mean a need to resync the data to recover, so I'm not
sure it's really an improvement. And I suppose it'd require tracking the
dependencies, because how else would you mark the subscription as
requiring a resync? At which point we could decline the DROP without a
CASCADE, just like we do elsewhere, no?

I was actually thinking more along the lines of just simply marking the
publication/subscription as being in a 'failed' state when a failure
actually happens, and maybe even at that point basically throwing away
everything except the shell of the publication/subscription (so the user
can see that it failed and come in and properly drop it); I'm thinking
about this as perhaps similar to a transaction being aborted.

There are several problems with that. First, this happens in a historic
snapshot which can't write, and on top of that we are in the middle of
error processing, so we have our hands tied a bit; it's definitely going
to need a bit of creative thinking to do this.

We can't write to things inside the database in a historic snapshot and
we do have to deal with the fact that we're in error processing. What
about writing somewhere that's outside of the regular database system?
Maybe a pg_logical/failed directory? There's all the usual
complications from that around dealing with durable writes (if we need
to worry about that and I'm not sure that we do... if we fail to
persist a write saying "X failed" and we restart.. well, it's gonna fail
again and we write it then), and cleaning things up as needed (but maybe
this is handled as part of the DROP, and we WAL that, so we can re-do
the removal of the failed marker file...), and if we need to think about
what should happen on replicas (is there anything?).
That sounds pretty reasonable. Given that this is corner-case user error
we could perhaps do extra work to ensure things are fsynced even if it's
all not too fast...
Second, and that's more of a soft issue (which is probably harder to solve):
what do we do with the slot and subscription? There is one failed
publication, but the subscription may be subscribed to 20 of them; do we
kill the whole subscription because of a single failed publication? If we
don't, do we continue replicating like nothing has happened, but with data
in the failed publication missing (which can be considered data
loss/corruption from the user's point of view)? If we stop replication, do we
clean the slot so that we don't keep back WAL/catalog xmin forever
(which could lead to the server stopping), or do we keep the slot so that
the user can somehow fix the issue (reconfigure the subscription to not care
about that publication, for example) and continue replication without
further loss?

I would think we'd have to fail the whole publication if there's a
failure for any part of it. Replicating a partial set definitely sounds
wrong to me. Once we stop replication, yes, we should clean the slot
and mark it failed so that we don't keep back WAL and so that we allow
the catalog xmin to move forward so that the failed publication doesn't
run the server out of disk space.
I agree that continuing replication where some part of publication is
broken seems wrong and that we should stop replication at that point.
If we really think there's a use-case for keeping the replication slot
It's not so much about the use-case as it is about a complete change of
behavior - there is no current error path where we remove an existing slot.

The use case for keeping the slot is a) investigation of the issue, and b) just
skipping the broken part of the stream by advancing the origin on the
subscription and continuing replication; with some luck that can mean only a
single table needs resyncing, which is better than rebuilding everything.

I think some kind of automated slot cleanup is desirable, but it's likely a
separate feature that should be designed based on the amount of outstanding
WAL or something.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hi,
On 2018-11-23 13:15:08 -0300, Euler Taveira wrote:
Besides the problem presented by Hironobu-san, I'm doing some cleanup
and improving docs. I also forgot to declare the pg_publication_rel TOAST
table.

Thanks for your review.
As far as I can tell, the patch has not been refreshed since. So I'm
marking this as returned with feedback for now. Please resubmit once
ready.
Greetings,
Andres Freund
Hi Euler,
On 2019-02-03 13:14, Andres Freund wrote:
On 2018-11-23 13:15:08 -0300, Euler Taveira wrote:
Besides the problem presented by Hironobu-san, I'm doing some cleanup
and improving docs. I also forgot to declare the pg_publication_rel TOAST
table.

Thanks for your review.
As far as I can tell, the patch has not been refreshed since. So I'm
marking this as returned with feedback for now. Please resubmit once
ready.
Do you have any plans for continuing working on this patch and
submitting it again on the closest September commitfest? There are only
a few days left. Anyway, I will be glad to review the patch if you do
submit it, though I didn't yet dig deeply into the code.
I've recently rebased the entire patch set (attached) and it works fine.
Your TAP test passes. Also I've added a new test case (see 0009
attached) with a real-life example of bidirectional replication (BDR)
utilising this new WHERE clause. This naive BDR is implemented using an
is_cloud flag, which is set to TRUE/FALSE on cloud/remote nodes
respectively.

Although almost all new tests pass, there is a problem with DELETE
replication, so 1 out of 10 tests fails. A DELETE isn't replicated if the
record was created with is_cloud=TRUE on the cloud node, replicated to the
remote node; then updated with is_cloud=FALSE on the remote node, replicated
back to the cloud node; then deleted on the remote node.
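To make the scenario concrete, the naive bidirectional setup described above can be sketched roughly as follows (table and object names here are illustrative, not necessarily those used in the attached 0009 test):

```sql
-- On the cloud node: publish only rows that originated locally.
CREATE TABLE t (id int PRIMARY KEY, payload text, is_cloud boolean);
CREATE PUBLICATION pub_cloud FOR TABLE t WHERE (is_cloud IS TRUE);

-- On the remote node: mirror table, opposite filter.
CREATE TABLE t (id int PRIMARY KEY, payload text, is_cloud boolean);
CREATE PUBLICATION pub_remote FOR TABLE t WHERE (is_cloud IS FALSE);

-- Each node subscribes to the other's publication, closing the loop:
--   on remote: CREATE SUBSCRIPTION sub_from_cloud
--              CONNECTION '...' PUBLICATION pub_cloud;
--   on cloud:  CREATE SUBSCRIPTION sub_from_remote
--              CONNECTION '...' PUBLICATION pub_remote;
```

The failing DELETE case presumably follows from the old tuple of a DELETE carrying only the replica-identity columns (here, the primary key id), so a filter on a non-key column such as is_cloud cannot be evaluated for that change.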
Regards
--
Alexey Kondratov
Postgres Professional https://www.postgrespro.com
Russian Postgres Company
Attachments:
v2-0001-Remove-unused-atttypmod-column-from-initial-table.patchtext/x-diff; name=v2-0001-Remove-unused-atttypmod-column-from-initial-table.patchDownload
From ae80e1616fb6374968a09e3c44f0abe59ebf3a87 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Fri, 9 Mar 2018 18:39:22 +0000
Subject: [PATCH v2 1/9] Remove unused atttypmod column from initial table
synchronization
Since commit 7c4f52409a8c7d85ed169bbbc1f6092274d03920, atttypmod was
added but not used. The removal is safe because COPY from publisher
does not need such information.
---
src/backend/replication/logical/tablesync.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 7881079e96..0a565dd837 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -647,7 +647,7 @@ fetch_remote_table_info(char *nspname, char *relname,
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[2] = {OIDOID, CHAROID};
- Oid attrRow[4] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
+ Oid attrRow[3] = {TEXTOID, OIDOID, BOOLOID};
bool isnull;
int natt;
@@ -691,7 +691,6 @@ fetch_remote_table_info(char *nspname, char *relname,
appendStringInfo(&cmd,
"SELECT a.attname,"
" a.atttypid,"
- " a.atttypmod,"
" a.attnum = ANY(i.indkey)"
" FROM pg_catalog.pg_attribute a"
" LEFT JOIN pg_catalog.pg_index i"
@@ -703,7 +702,7 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->remoteid,
(walrcv_server_version(wrconn) >= 120000 ? "AND a.attgenerated = ''" : ""),
lrel->remoteid);
- res = walrcv_exec(wrconn, cmd.data, 4, attrRow);
+ res = walrcv_exec(wrconn, cmd.data, 3, attrRow);
if (res->status != WALRCV_OK_TUPLES)
ereport(ERROR,
@@ -724,7 +723,7 @@ fetch_remote_table_info(char *nspname, char *relname,
Assert(!isnull);
lrel->atttyps[natt] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
Assert(!isnull);
- if (DatumGetBool(slot_getattr(slot, 4, &isnull)))
+ if (DatumGetBool(slot_getattr(slot, 3, &isnull)))
lrel->attkeys = bms_add_member(lrel->attkeys, natt);
/* Should never happen. */
base-commit: 9acda731184c1ebdf99172cbb19d0404b7eebc37
--
2.19.1
v2-0002-Store-number-of-tuples-in-WalRcvExecResult.patchtext/x-diff; name=v2-0002-Store-number-of-tuples-in-WalRcvExecResult.patchDownload
From 362f2cc97745690ff4739b530f5ba95aea59be09 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Fri, 9 Mar 2018 17:37:36 +0000
Subject: [PATCH v2 2/9] Store number of tuples in WalRcvExecResult
It seems to be a useful information while allocating memory for queries
that returns more than one row. It reduces memory allocation
for initial table synchronization.
While in it, since we have the number of columns, allocate only nfields
for cstrs instead of MaxTupleAttributeNumber.
---
.../replication/libpqwalreceiver/libpqwalreceiver.c | 9 ++++++---
src/backend/replication/logical/tablesync.c | 5 ++---
src/include/replication/walreceiver.h | 1 +
3 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 6eba08a920..846b6f89f1 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -878,6 +878,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
errdetail("Expected %d fields, got %d fields.",
nRetTypes, nfields)));
+ walres->ntuples = PQntuples(pgres);
walres->tuplestore = tuplestore_begin_heap(true, false, work_mem);
/* Create tuple descriptor corresponding to expected result. */
@@ -888,7 +889,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
/* No point in doing more here if there were no tuples returned. */
- if (PQntuples(pgres) == 0)
+ if (walres->ntuples == 0)
return;
/* Create temporary context for local allocations. */
@@ -897,15 +898,17 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
ALLOCSET_DEFAULT_SIZES);
/* Process returned rows. */
- for (tupn = 0; tupn < PQntuples(pgres); tupn++)
+ for (tupn = 0; tupn < walres->ntuples; tupn++)
{
- char *cstrs[MaxTupleAttributeNumber];
+ char **cstrs;
ProcessWalRcvInterrupts();
/* Do the allocations in temporary context. */
oldcontext = MemoryContextSwitchTo(rowcontext);
+ cstrs = palloc(nfields * sizeof(char *));
+
/*
* Fill cstrs with null-terminated strings of column values.
*/
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 0a565dd837..42db4ada9e 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -709,9 +709,8 @@ fetch_remote_table_info(char *nspname, char *relname,
(errmsg("could not fetch table info for table \"%s.%s\": %s",
nspname, relname, res->err)));
- /* We don't know the number of rows coming, so allocate enough space. */
- lrel->attnames = palloc0(MaxTupleAttributeNumber * sizeof(char *));
- lrel->atttyps = palloc0(MaxTupleAttributeNumber * sizeof(Oid));
+ lrel->attnames = palloc0(res->ntuples * sizeof(char *));
+ lrel->atttyps = palloc0(res->ntuples * sizeof(Oid));
lrel->attkeys = NULL;
natt = 0;
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index e12a934966..0d32d598d8 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -196,6 +196,7 @@ typedef struct WalRcvExecResult
char *err;
Tuplestorestate *tuplestore;
TupleDesc tupledesc;
+ int ntuples;
} WalRcvExecResult;
/* libpqwalreceiver hooks */
--
2.19.1
v2-0003-Refactor-function-create_estate_for_relation.patchtext/x-diff; name=v2-0003-Refactor-function-create_estate_for_relation.patchDownload
From 2b23421d208c4760e7b3d38adc57ec62685b2d35 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Tue, 27 Feb 2018 02:21:03 +0000
Subject: [PATCH v2 3/9] Refactor function create_estate_for_relation
Relation localrel is the only LogicalRepRelMapEntry structure member
that is useful for create_estate_for_relation.
---
src/backend/replication/logical/worker.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 11e6331f49..d9952c8b7e 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -173,7 +173,7 @@ ensure_transaction(void)
* This is based on similar code in copy.c
*/
static EState *
-create_estate_for_relation(LogicalRepRelMapEntry *rel)
+create_estate_for_relation(Relation rel)
{
EState *estate;
ResultRelInfo *resultRelInfo;
@@ -183,13 +183,13 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel)
rte = makeNode(RangeTblEntry);
rte->rtekind = RTE_RELATION;
- rte->relid = RelationGetRelid(rel->localrel);
- rte->relkind = rel->localrel->rd_rel->relkind;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
rte->rellockmode = AccessShareLock;
ExecInitRangeTable(estate, list_make1(rte));
resultRelInfo = makeNode(ResultRelInfo);
- InitResultRelInfo(resultRelInfo, rel->localrel, 1, NULL, 0);
+ InitResultRelInfo(resultRelInfo, rel, 1, NULL, 0);
estate->es_result_relations = resultRelInfo;
estate->es_num_result_relations = 1;
@@ -589,7 +589,7 @@ apply_handle_insert(StringInfo s)
}
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -696,7 +696,7 @@ apply_handle_update(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -815,7 +815,7 @@ apply_handle_delete(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
--
2.19.1
v2-0004-Rename-a-WHERE-node.patchtext/x-diff; name=v2-0004-Rename-a-WHERE-node.patchDownload
From b5e55cace866ad8b4de554fe6cd326ad2d11fe81 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Wed, 24 Jan 2018 17:01:31 -0200
Subject: [PATCH v2 4/9] Rename a WHERE node
A WHERE clause will be used for row filtering in logical replication. We
already have a similar node: 'WHERE (condition here)'. Let's rename the
node to a generic name and use it for row filtering too.
---
src/backend/parser/gram.y | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c97bb367f8..1de8f56794 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -476,7 +476,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <node> def_arg columnElem where_clause where_or_current_clause
a_expr b_expr c_expr AexprConst indirection_el opt_slice_bound
columnref in_expr having_clause func_table xmltable array_expr
- ExclusionWhereClause operator_def_arg
+ OptWhereClause operator_def_arg
%type <list> rowsfrom_item rowsfrom_list opt_col_def_list
%type <boolean> opt_ordinality
%type <list> ExclusionConstraintList ExclusionConstraintElem
@@ -3710,7 +3710,7 @@ ConstraintElem:
$$ = (Node *)n;
}
| EXCLUDE access_method_clause '(' ExclusionConstraintList ')'
- opt_c_include opt_definition OptConsTableSpace ExclusionWhereClause
+ opt_c_include opt_definition OptConsTableSpace OptWhereClause
ConstraintAttributeSpec
{
Constraint *n = makeNode(Constraint);
@@ -3812,7 +3812,7 @@ ExclusionConstraintElem: index_elem WITH any_operator
}
;
-ExclusionWhereClause:
+OptWhereClause:
WHERE '(' a_expr ')' { $$ = $3; }
| /*EMPTY*/ { $$ = NULL; }
;
--
2.19.1
v2-0005-Row-filtering-for-logical-replication.patchtext/x-diff; name=v2-0005-Row-filtering-for-logical-replication.patchDownload
From fcb9a06babe9ce94bff256e93ae1a2f44c4532b6 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Tue, 27 Feb 2018 04:03:13 +0000
Subject: [PATCH v2 5/9] Row filtering for logical replication
When you define or modify a publication you optionally can filter rows
to be published using a WHERE condition. This condition is any
expression that evaluates to boolean. Only those rows that
satisfy the WHERE condition will be sent to subscribers.
---
doc/src/sgml/ref/alter_publication.sgml | 9 +-
doc/src/sgml/ref/create_publication.sgml | 14 ++-
src/backend/catalog/pg_publication.c | 46 +++++++-
src/backend/commands/publicationcmds.c | 74 ++++++++----
src/backend/parser/gram.y | 26 ++++-
src/backend/parser/parse_agg.c | 10 ++
src/backend/parser/parse_expr.c | 5 +
src/backend/parser/parse_func.c | 2 +
src/backend/replication/logical/proto.c | 2 +-
src/backend/replication/logical/relation.c | 13 +++
src/backend/replication/logical/tablesync.c | 119 ++++++++++++++++++--
src/backend/replication/logical/worker.c | 2 +-
src/backend/replication/pgoutput/pgoutput.c | 98 +++++++++++++++-
src/include/catalog/pg_publication.h | 10 +-
src/include/catalog/pg_publication_rel.h | 10 +-
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/include/replication/logicalproto.h | 2 +
src/include/replication/logicalrelation.h | 2 +
src/test/regress/expected/misc_sanity.out | 3 +-
src/test/subscription/t/010_row_filter.pl | 97 ++++++++++++++++
22 files changed, 499 insertions(+), 58 deletions(-)
create mode 100644 src/test/subscription/t/010_row_filter.pl
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index 534e598d93..5984915767 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_USER | SESSION_USER }
@@ -91,7 +91,10 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the expression.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 99f87ca393..d5fed304a1 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
@@ -68,7 +68,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
that table is added to the publication. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are added.
Optionally, <literal>*</literal> can be specified after the table name to
- explicitly indicate that descendant tables are included.
+ explicitly indicate that descendant tables are included. If the optional
+ <literal>WHERE</literal> clause is specified, rows that do not satisfy
+ the <replaceable class="parameter">expression</replaceable> will not be
+ published. Note that parentheses are required around the expression.
</para>
<para>
@@ -183,6 +186,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</programlisting>
</para>
+ <para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
<para>
Create a publication that publishes all changes in all tables:
<programlisting>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index fd5da7d5f7..47a793408d 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -34,6 +34,10 @@
#include "catalog/pg_publication.h"
#include "catalog/pg_publication_rel.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
+
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -149,18 +153,21 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
+ Oid relid = RelationGetRelid(targetrel->relation);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ RangeTblEntry *rte;
+ Node *whereclause;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -180,10 +187,27 @@ publication_add_relation(Oid pubid, Relation targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel), pub->name)));
+ RelationGetRelationName(targetrel->relation), pub->name)));
}
- check_publication_add_relation(targetrel);
+ check_publication_add_relation(targetrel->relation);
+
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(targetrel->whereClause);
+
+ rte = addRangeTableEntryForRelation(pstate, targetrel->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addRTEtoQuery(pstate, rte, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(targetrel->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -197,6 +221,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add row filter, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prrowfilter - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prrowfilter - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -213,11 +243,17 @@ publication_add_relation(Oid pubid, Relation targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the row filter expression */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel);
+ CacheInvalidateRelcache(targetrel->relation);
return myself;
}
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index f115d4bf80..bc7f9210e9 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -352,6 +352,27 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ /*
+ * ALTER PUBLICATION ... DROP TABLE cannot contain a WHERE clause. Use the
+ * publication_table_list node (which accepts a WHERE clause) but forbid
+ * the WHERE clause in it. Using a separate relation_expr_list node just
+ * for the DROP TABLE part is not worth the trouble.
+ */
+ if (stmt->tableAction == DEFELEM_DROP)
+ {
+ ListCell *lc;
+
+ foreach(lc, stmt->tables)
+ {
+ PublicationTable *t = lfirst(lc);
+ if (t->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot use a WHERE clause when removing a table from publication \"%s\"",
+ NameStr(pubform->pubname))));
+ }
+ }
+
rels = OpenTableList(stmt->tables);
if (stmt->tableAction == DEFELEM_ADD)
@@ -373,9 +394,9 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
foreach(newlc, rels)
{
- Relation newrel = (Relation) lfirst(newlc);
+ PublicationRelationQual *newrel = (PublicationRelationQual *) lfirst(newlc);
- if (RelationGetRelid(newrel) == oldrelid)
+ if (RelationGetRelid(newrel->relation) == oldrelid)
{
found = true;
break;
@@ -384,7 +405,8 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
if (!found)
{
- Relation oldrel = table_open(oldrelid,
+ PublicationRelationQual *oldrel = palloc(sizeof(PublicationRelationQual));
+ oldrel->relation = table_open(oldrelid,
ShareUpdateExclusiveLock);
delrels = lappend(delrels, oldrel);
@@ -510,16 +532,22 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationQual *relqual;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
- bool recurse = rv->inh;
- Relation rel;
- Oid myrelid;
+ PublicationTable *t = lfirst(lc);
+ RangeVar *rv = t->relation;
+ Relation rel;
+ bool recurse = rv->inh;
+ Oid myrelid;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
@@ -539,8 +567,10 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = rel;
+ relqual->whereClause = t->whereClause;
+ rels = lappend(rels, relqual);
relids = lappend_oid(relids, myrelid);
/* Add children of this rel, if requested */
@@ -568,7 +598,11 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = rel;
+ /* child inherits WHERE clause from parent */
+ relqual->whereClause = t->whereClause;
+ rels = lappend(rels, relqual);
relids = lappend_oid(relids, childrelid);
}
}
@@ -589,10 +623,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -608,13 +644,13 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(RelationGetRelid(rel->relation), GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->relation->rd_rel->relkind),
+ RelationGetRelationName(rel->relation));
obj = publication_add_relation(pubid, rel, if_not_exists);
if (stmt)
@@ -640,8 +676,8 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
+ Oid relid = RelationGetRelid(rel->relation);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
ObjectIdGetDatum(relid),
@@ -654,7 +690,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(rel->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 1de8f56794..bd87e80e1b 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -404,13 +404,13 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
relation_expr_list dostmt_opt_list
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
- publication_name_list
+ publication_name_list publication_table_list
vacuum_relation_list opt_vacuum_relation_list
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <value> publication_name_item
%type <list> opt_fdw_options fdw_options
@@ -9518,7 +9518,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9549,7 +9549,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9557,7 +9557,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9565,7 +9565,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE relation_expr_list
+ | ALTER PUBLICATION name DROP TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9575,6 +9575,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index f418c61545..e317e04695 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -536,6 +536,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in CALL arguments");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_COPY_WHERE:
if (isAgg)
@@ -933,6 +940,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_GENERATED_COLUMN:
err = _("window functions are not allowed in column generation expressions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 76f3dd7076..4942f28f50 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -571,6 +571,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_CALL_ARGUMENT:
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1924,6 +1925,8 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
break;
case EXPR_KIND_CALL_ARGUMENT:
err = _("cannot use subquery in CALL argument");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
case EXPR_KIND_COPY_WHERE:
err = _("cannot use subquery in COPY FROM WHERE condition");
@@ -3561,6 +3564,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "WHERE";
case EXPR_KIND_GENERATED_COLUMN:
return "GENERATED AS";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 8e926539e6..c7ec300d0e 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2509,6 +2509,8 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
break;
case EXPR_KIND_CALL_ARGUMENT:
err = _("set-returning functions are not allowed in CALL arguments");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
case EXPR_KIND_COPY_WHERE:
err = _("set-returning functions are not allowed in COPY FROM WHERE conditions");
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index e7df47de3e..eb4eaa2a33 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -378,7 +378,7 @@ logicalrep_write_rel(StringInfo out, Relation rel)
LogicalRepRelation *
logicalrep_read_rel(StringInfo in)
{
- LogicalRepRelation *rel = palloc(sizeof(LogicalRepRelation));
+ LogicalRepRelation *rel = palloc0(sizeof(LogicalRepRelation));
rel->remoteid = pq_getmsgint(in, 4);
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 85269c037d..a14986e3ab 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -140,6 +140,16 @@ logicalrep_relmap_free_entry(LogicalRepRelMapEntry *entry)
}
bms_free(remoterel->attkeys);
+ if (remoterel->nrowfilters > 0)
+ {
+ int i;
+
+ for (i = 0; i < remoterel->nrowfilters; i++)
+ pfree(remoterel->rowfiltercond[i]);
+
+ pfree(remoterel->rowfiltercond);
+ }
+
if (entry->attrmap)
pfree(entry->attrmap);
}
@@ -187,6 +197,9 @@ logicalrep_relmap_update(LogicalRepRelation *remoterel)
}
entry->remoterel.replident = remoterel->replident;
entry->remoterel.attkeys = bms_copy(remoterel->attkeys);
+ entry->remoterel.rowfiltercond = palloc(remoterel->nrowfilters * sizeof(char *));
+ for (i = 0; i < remoterel->nrowfilters; i++)
+ entry->remoterel.rowfiltercond[i] = pstrdup(remoterel->rowfiltercond[i]);
MemoryContextSwitchTo(oldctx);
}
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 42db4ada9e..fc37f74e89 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -648,8 +648,14 @@ fetch_remote_table_info(char *nspname, char *relname,
TupleTableSlot *slot;
Oid tableRow[2] = {OIDOID, CHAROID};
Oid attrRow[3] = {TEXTOID, OIDOID, BOOLOID};
+ Oid rowfilterRow[1] = {TEXTOID};
bool isnull;
- int natt;
+ int n;
+ ListCell *lc;
+ bool first;
+
+ /* Avoid trashing relation map cache */
+ memset(lrel, 0, sizeof(LogicalRepRelation));
lrel->nspname = nspname;
lrel->relname = relname;
@@ -713,20 +719,20 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->atttyps = palloc0(res->ntuples * sizeof(Oid));
lrel->attkeys = NULL;
- natt = 0;
+ n = 0;
slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
{
- lrel->attnames[natt] =
+ lrel->attnames[n] =
TextDatumGetCString(slot_getattr(slot, 1, &isnull));
Assert(!isnull);
- lrel->atttyps[natt] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
+ lrel->atttyps[n] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
Assert(!isnull);
if (DatumGetBool(slot_getattr(slot, 3, &isnull)))
- lrel->attkeys = bms_add_member(lrel->attkeys, natt);
+ lrel->attkeys = bms_add_member(lrel->attkeys, n);
/* Should never happen. */
- if (++natt >= MaxTupleAttributeNumber)
+ if (++n >= MaxTupleAttributeNumber)
elog(ERROR, "too many columns in remote table \"%s.%s\"",
nspname, relname);
@@ -734,7 +740,54 @@ fetch_remote_table_info(char *nspname, char *relname,
}
ExecDropSingleTupleTableSlot(slot);
- lrel->natts = natt;
+ lrel->natts = n;
+
+ walrcv_clear_result(res);
+
+ /* Fetch row filtering info */
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd, "SELECT pg_get_expr(prrowfilter, prrelid) FROM pg_publication p INNER JOIN pg_publication_rel pr ON (p.oid = pr.prpubid) WHERE pr.prrelid = %u AND p.pubname IN (", MyLogicalRepWorker->relid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(wrconn, cmd.data, 1, rowfilterRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch row filter info for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ lrel->rowfiltercond = palloc0(res->ntuples * sizeof(char *));
+
+ n = 0;
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ {
+ char *p = TextDatumGetCString(rf);
+ lrel->rowfiltercond[n++] = p;
+ }
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ lrel->nrowfilters = n;
walrcv_clear_result(res);
pfree(cmd.data);
@@ -767,10 +820,57 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* list of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
initStringInfo(&cmd);
- appendStringInfo(&cmd, "COPY %s TO STDOUT",
- quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /*
+ * If the publication has any row filters, build a SELECT query with the
+ * OR'ed row filters for COPY. If no row filters are available, COPY the
+ * whole table contents.
+ */
+ if (lrel.nrowfilters > 0)
+ {
+ ListCell *lc;
+ bool first;
+ int i;
+
+ appendStringInfoString(&cmd, "COPY (SELECT ");
+ /* list of attribute names */
+ first = true;
+ foreach(lc, attnamelist)
+ {
+ char *col = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+ appendStringInfo(&cmd, "%s", quote_identifier(col));
+ }
+ appendStringInfo(&cmd, " FROM %s",
+ quote_qualified_identifier(lrel.nspname, lrel.relname));
+ appendStringInfoString(&cmd, " WHERE ");
+ /* list of OR'ed filters */
+ first = true;
+ for (i = 0; i < lrel.nrowfilters; i++)
+ {
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfo(&cmd, "%s", lrel.rowfiltercond[i]);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
+ }
+ else
+ {
+ appendStringInfo(&cmd, "COPY %s TO STDOUT",
+ quote_qualified_identifier(lrel.nspname, lrel.relname));
+ }
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
@@ -785,7 +885,6 @@ copy_table(Relation rel)
addRangeTableEntryForRelation(pstate, rel, AccessShareLock,
NULL, false, false);
- attnamelist = make_copy_attnamelist(relmapentry);
cstate = BeginCopyFrom(pstate, rel, NULL, false, copy_read_data, attnamelist, NIL);
/* Do the copy */
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index d9952c8b7e..cef0c52955 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -172,7 +172,7 @@ ensure_transaction(void)
*
* This is based on similar code in copy.c
*/
-static EState *
+EState *
create_estate_for_relation(Relation rel)
{
EState *estate;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 9c08757fca..e9646ac483 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,15 +12,26 @@
*/
#include "postgres.h"
+#include "catalog/pg_type.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
+
+#include "executor/executor.h"
+#include "nodes/execnodes.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/planner.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "fmgr.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/inval.h"
#include "utils/int8.h"
#include "utils/memutils.h"
@@ -60,6 +71,7 @@ typedef struct RelationSyncEntry
bool schema_sent; /* did we send the schema? */
bool replicate_valid;
PublicationActions pubactions;
+ List *row_filter;
} RelationSyncEntry;
/* Map used to remember which relation schemas we sent. */
@@ -335,6 +347,63 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Assert(false);
}
+ /* ... then check row filter */
+ if (list_length(relentry->row_filter) > 0)
+ {
+ HeapTuple old_tuple;
+ HeapTuple new_tuple;
+ TupleDesc tupdesc;
+ EState *estate;
+ ExprContext *ecxt;
+ MemoryContext oldcxt;
+ ListCell *lc;
+
+ old_tuple = change->data.tp.oldtuple ? &change->data.tp.oldtuple->tuple : NULL;
+ new_tuple = change->data.tp.newtuple ? &change->data.tp.newtuple->tuple : NULL;
+ tupdesc = RelationGetDescr(relation);
+ estate = create_estate_for_relation(relation);
+
+ /* prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
+
+ ExecStoreHeapTuple(new_tuple ? new_tuple : old_tuple, ecxt->ecxt_scantuple, false);
+
+ foreach (lc, relentry->row_filter)
+ {
+ Node *row_filter;
+ ExprState *expr_state;
+ Expr *expr;
+ Oid expr_type;
+ Datum res;
+ bool isnull;
+
+ row_filter = (Node *) lfirst(lc);
+
+ /* evaluate the row filter */
+ expr_type = exprType(row_filter);
+ expr = (Expr *) coerce_to_target_type(NULL, row_filter, expr_type, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+ expr = expression_planner(expr);
+ expr_state = ExecInitExpr(expr, NULL);
+ res = ExecEvalExpr(expr_state, ecxt, &isnull);
+
+ /* if tuple does not match row filter, bail out */
+ if (!DatumGetBool(res) || isnull)
+ {
+ MemoryContextSwitchTo(oldcxt);
+ ExecDropSingleTupleTableSlot(ecxt->ecxt_scantuple);
+ FreeExecutorState(estate);
+ return;
+ }
+ }
+
+ MemoryContextSwitchTo(oldcxt);
+
+ ExecDropSingleTupleTableSlot(ecxt->ecxt_scantuple);
+ FreeExecutorState(estate);
+ }
+
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
@@ -570,10 +639,14 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
*/
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->row_filter = NIL;
foreach(lc, data->publications)
{
Publication *pub = lfirst(lc);
+ HeapTuple rf_tuple;
+ Datum rf_datum;
+ bool rf_isnull;
if (pub->alltables || list_member_oid(pubids, pub->oid))
{
@@ -583,9 +656,23 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /* Cache row filters, if available */
+ rf_tuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rf_tuple))
+ {
+ rf_datum = SysCacheGetAttr(PUBLICATIONRELMAP, rf_tuple, Anum_pg_publication_rel_prrowfilter, &rf_isnull);
+
+ if (!rf_isnull)
+ {
+ MemoryContext oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ char *s = TextDatumGetCString(rf_datum);
+ Node *rf_node = stringToNode(s);
+ entry->row_filter = lappend(entry->row_filter, rf_node);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rf_tuple);
+ }
}
list_free(pubids);
@@ -660,5 +747,10 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
*/
hash_seq_init(&status, RelationSyncCache);
while ((entry = (RelationSyncEntry *) hash_seq_search(&status)) != NULL)
+ {
entry->replicate_valid = false;
+ if (list_length(entry->row_filter) > 0)
+ list_free(entry->row_filter);
+ entry->row_filter = NIL;
+ }
}
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 20a2f0ac1b..fe878c6957 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -78,6 +78,12 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationQual
+{
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationQual;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -86,8 +92,8 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(void);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
- bool if_not_exists);
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
+ bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
extern char *get_publication_name(Oid pubid, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index 5f5bc92ab3..70ffef5dfd 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -28,9 +28,13 @@
*/
CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
{
- Oid oid; /* oid */
- Oid prpubid; /* Oid of the publication */
- Oid prrelid; /* Oid of the relation */
+ Oid oid; /* oid */
+ Oid prpubid; /* Oid of the publication */
+ Oid prrelid; /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prrowfilter; /* nodeToString representation of row filter */
+#endif
} FormData_pg_publication_rel;
/* ----------------
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 3cbb08df92..7f83da1ee8 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -476,6 +476,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 94ded3c135..359f773092 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3461,12 +3461,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3479,7 +3486,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 7c099e7084..c2e8b9fcb9 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -73,6 +73,7 @@ typedef enum ParseExprKind
EXPR_KIND_CALL_ARGUMENT, /* procedure argument in CALL */
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 3fc430af01..8dabc35791 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -50,6 +50,8 @@ typedef struct LogicalRepRelation
Oid *atttyps; /* column types */
char replident; /* replica identity */
Bitmapset *attkeys; /* Bitmap of key columns */
+ char **rowfiltercond; /* condition for row filtering */
+ int nrowfilters; /* number of row filters */
} LogicalRepRelation;
/* Type mapping info */
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 2642a3f94e..5cc307ee0e 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -39,4 +39,6 @@ extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
extern void logicalrep_typmap_update(LogicalRepTyp *remotetyp);
extern char *logicalrep_typmap_gettypname(Oid remoteid);
+extern EState *create_estate_for_relation(Relation rel);
+
#endif /* LOGICALRELATION_H */
diff --git a/src/test/regress/expected/misc_sanity.out b/src/test/regress/expected/misc_sanity.out
index 8538173ff8..68bbdf7016 100644
--- a/src/test/regress/expected/misc_sanity.out
+++ b/src/test/regress/expected/misc_sanity.out
@@ -107,5 +107,6 @@ ORDER BY 1, 2;
pg_index | indpred | pg_node_tree
pg_largeobject | data | bytea
pg_largeobject_metadata | lomacl | aclitem[]
-(11 rows)
+ pg_publication_rel | prrowfilter | pg_node_tree
+(12 rows)
diff --git a/src/test/subscription/t/010_row_filter.pl b/src/test/subscription/t/010_row_filter.pl
new file mode 100644
index 0000000000..6c174fa895
--- /dev/null
+++ b/src/test/subscription/t/010_row_filter.pl
@@ -0,0 +1,97 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 4;
+
+# create publisher node
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')");
+
+my $result = $node_publisher->psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 DROP TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')");
+is($result, 3, "syntax error for ALTER PUBLICATION DROP TABLE");
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 2 = 0)");
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3");
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)");
+
+# test row filtering
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1003) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 10)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table sync to finish
+my $synced_query =
+"SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+#$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT a, b FROM tab_rowfilter_1");
+is($result, qq(1980|not filtered
+1001|test 1001
+1002|test 1002
+1003|test 1003), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(7|2|10), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check filtered data was copied to subscriber');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
2.19.1
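As a cross-check of the expected values in the TAP test above, here is a small Python sketch (not part of the patch) that applies the same predicates to the inserted rows; it assumes the initial copy brings in rows matching either subscribed publication's filter for tab_rowfilter_2:

```python
# Rows inserted into tab_rowfilter_1 by the test above
rows_1 = [(1, 'not replicated'), (1500, 'filtered'), (1980, 'not filtered')]
rows_1 += [(x, 'test %d' % x) for x in range(990, 1004)]

# Publication filter on tab_rowfilter_1: a > 1000 AND b <> 'filtered'
published = sorted((a, b) for a, b in rows_1 if a > 1000 and b != 'filtered')
print(published)   # 4 rows: 1001..1003 plus 1980

# tab_rowfilter_2 is in both publications the subscriber follows:
# tap_pub_1 (c % 2 = 0) and tap_pub_2 (c % 3 = 0)
copied = sorted(c for c in range(1, 11) if c % 2 == 0 or c % 3 == 0)
print(len(copied), min(copied), max(copied))   # 7 2 10
```

This reproduces the `is(...)` expectations: four rows for tab_rowfilter_1 and `7|2|10` for tab_rowfilter_2.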
v2-0006-Print-publication-WHERE-condition-in-psql.patch (text/x-diff)

From 19cb78c319e3755f412b3d622e68837ccaf05496 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Thu, 17 May 2018 20:52:28 +0000
Subject: [PATCH v2 6/9] Print publication WHERE condition in psql
---
src/bin/psql/describe.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 774cc764ff..f103747901 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5875,7 +5875,8 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
+ "SELECT n.nspname, c.relname,\n"
+ " pg_get_expr(pr.prrowfilter, c.oid)\n"
"FROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
@@ -5905,6 +5906,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE %s",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
--
2.19.1
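To illustrate what the psql change above does, here is a hypothetical sketch of how the per-table footer line could be assembled; the actual format string lives in describePublications() and is not shown in the hunk, so the exact layout here is an assumption:

```python
def publication_table_footer(nspname, relname, rowfilter=None):
    # assumed footer layout for \dRp+; the patch appends " WHERE <expr>"
    # only when pg_get_expr(pr.prrowfilter, c.oid) is not NULL
    line = '    "%s.%s"' % (nspname, relname)
    if rowfilter is not None:
        line += ' WHERE %s' % rowfilter
    return line

print(publication_table_footer('public', 'tab_rowfilter_1',
                               "((a > 1000) AND (b <> 'filtered'::text))"))
```

Tables without a row filter keep their pre-patch, unadorned footer.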
v2-0007-Publication-where-condition-support-for-pg_dump.patch (text/x-diff)
From c773fc66047feed639b50648f86d9a5cb783fc07 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Sat, 15 Sep 2018 02:52:00 +0000
Subject: [PATCH v2 7/9] Publication where condition support for pg_dump
---
src/bin/pg_dump/pg_dump.c | 15 +++++++++++++--
src/bin/pg_dump/pg_dump.h | 1 +
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 34981401bf..91475bc5f8 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3959,6 +3959,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_tableoid;
int i_oid;
int i_pubname;
+ int i_pubrelqual;
int i,
j,
ntups;
@@ -3991,7 +3992,8 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
/* Get the publication membership for the table. */
appendPQExpBuffer(query,
- "SELECT pr.tableoid, pr.oid, p.pubname "
+ "SELECT pr.tableoid, pr.oid, p.pubname, "
+ "pg_catalog.pg_get_expr(pr.prrowfilter, pr.prrelid) AS pubrelqual "
"FROM pg_publication_rel pr, pg_publication p "
"WHERE pr.prrelid = '%u'"
" AND p.oid = pr.prpubid",
@@ -4012,6 +4014,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_pubname = PQfnumber(res, "pubname");
+ i_pubrelqual = PQfnumber(res, "pubrelqual");
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4027,6 +4030,11 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].pubname = pg_strdup(PQgetvalue(res, j, i_pubname));
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, j, i_pubrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, j, i_pubrelqual));
+
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
}
@@ -4055,8 +4063,11 @@ dumpPublicationTable(Archive *fout, PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubrinfo->pubname));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE %s", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index dfba58ac58..3ed2e1be9c 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -608,6 +608,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
TableInfo *pubtable;
char *pubname;
+ char *pubrelqual;
} PublicationRelInfo;
/*
--
2.19.1
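The pg_dump change splices the row filter between the table name and the trailing semicolon. A minimal Python sketch of the emitted SQL (identifier quoting via fmtId()/fmtQualifiedDumpable() is ignored here for brevity):

```python
def dump_publication_table(pubname, qualified_table, rowfilter=None):
    # mirrors dumpPublicationTable() after the patch: the optional
    # WHERE clause goes before the terminating semicolon
    query = 'ALTER PUBLICATION %s ADD TABLE ONLY %s' % (pubname, qualified_table)
    if rowfilter is not None:
        query += ' WHERE %s' % rowfilter
    return query + ';\n'

print(dump_publication_table('tap_pub_1', 'public.tab_rowfilter_1',
                             '(a > 1000)'))
```

A NULL prrowfilter (no filter) yields exactly the pre-patch statement, so dumps of unfiltered publications are unchanged.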
v2-0008-Debug-for-row-filtering.patch (text/x-diff)
From 61a3e08edd5c3ce3cbcc750a2c074922afd6676d Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Wed, 14 Mar 2018 00:53:17 +0000
Subject: [PATCH v2 8/9] Debug for row filtering
---
src/backend/commands/publicationcmds.c | 11 ++++
src/backend/replication/logical/tablesync.c | 1 +
src/backend/replication/pgoutput/pgoutput.c | 66 +++++++++++++++++++++
3 files changed, 78 insertions(+)
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index bc7f9210e9..8e107bddfb 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -341,6 +341,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *rels = NIL;
Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
Oid pubid = pubform->oid;
+ ListCell *xpto;
/* Check that user is allowed to manipulate the publication tables. */
if (pubform->puballtables)
@@ -352,6 +353,16 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ foreach(xpto, stmt->tables)
+ {
+ PublicationTable *t = lfirst(xpto);
+
+ if (t->whereClause == NULL)
+ elog(DEBUG3, "publication \"%s\" has no WHERE clause", NameStr(pubform->pubname));
+ else
+ elog(DEBUG3, "publication \"%s\" has WHERE clause", NameStr(pubform->pubname));
+ }
+
/*
* ALTER PUBLICATION ... DROP TABLE cannot contain a WHERE clause. Use
* publication_table_list node (that accepts a WHERE clause) but forbid the
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index fc37f74e89..c86affad03 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -871,6 +871,7 @@ copy_table(Relation rel)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
}
+ elog(DEBUG2, "COPY for initial synchronization: %s", cmd.data);
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index e9646ac483..5012cfdde7 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -34,6 +34,7 @@
#include "utils/builtins.h"
#include "utils/inval.h"
#include "utils/int8.h"
+#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -323,6 +324,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
MemoryContext old;
RelationSyncEntry *relentry;
+ Form_pg_class class_form;
+ char *schemaname;
+ char *tablename;
+
if (!is_publishable_relation(relation))
return;
@@ -347,6 +352,17 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Assert(false);
}
+ class_form = RelationGetForm(relation);
+ schemaname = get_namespace_name(class_form->relnamespace);
+ tablename = NameStr(class_form->relname);
+
+ if (change->action == REORDER_BUFFER_CHANGE_INSERT)
+ elog(DEBUG1, "INSERT \"%s\".\"%s\" txid: %u", schemaname, tablename, txn->xid);
+ else if (change->action == REORDER_BUFFER_CHANGE_UPDATE)
+ elog(DEBUG1, "UPDATE \"%s\".\"%s\" txid: %u", schemaname, tablename, txn->xid);
+ else if (change->action == REORDER_BUFFER_CHANGE_DELETE)
+ elog(DEBUG1, "DELETE \"%s\".\"%s\" txid: %u", schemaname, tablename, txn->xid);
+
/* ... then check row filter */
if (list_length(relentry->row_filter) > 0)
{
@@ -363,6 +379,42 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
tupdesc = RelationGetDescr(relation);
estate = create_estate_for_relation(relation);
+#ifdef _NOT_USED
+ if (old_tuple)
+ {
+ int i;
+
+ for (i = 0; i < tupdesc->natts; i++)
+ {
+ Form_pg_attribute attr;
+ HeapTuple type_tuple;
+ Oid typoutput;
+ bool typisvarlena;
+ bool isnull;
+ Datum val;
+ char *outputstr = NULL;
+
+ attr = TupleDescAttr(tupdesc, i);
+
+ /* Figure out type name */
+ type_tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(attr->atttypid));
+ if (HeapTupleIsValid(type_tuple))
+ {
+ /* Get information needed for printing values of a type */
+ getTypeOutputInfo(attr->atttypid, &typoutput, &typisvarlena);
+
+ val = heap_getattr(old_tuple, i + 1, tupdesc, &isnull);
+ if (!isnull)
+ {
+ outputstr = OidOutputFunctionCall(typoutput, val);
+ elog(DEBUG2, "row filter: REPLICA IDENTITY %s: %s", NameStr(attr->attname), outputstr);
+ pfree(outputstr);
+ }
+ }
+ }
+ }
+#endif
+
/* prepare context per tuple */
ecxt = GetPerTupleExprContext(estate);
oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
@@ -378,6 +430,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Oid expr_type;
Datum res;
bool isnull;
+ char *s = NULL;
row_filter = (Node *) lfirst(lc);
@@ -388,14 +441,24 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
expr_state = ExecInitExpr(expr, NULL);
res = ExecEvalExpr(expr_state, ecxt, &isnull);
+ elog(DEBUG3, "row filter: result: %s ; isnull: %s", (DatumGetBool(res)) ? "true" : "false", (isnull) ? "true" : "false");
+
/* if tuple does not match row filter, bail out */
if (!DatumGetBool(res) || isnull)
{
+ s = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, CStringGetTextDatum(nodeToString(row_filter)), ObjectIdGetDatum(relentry->relid)));
+ elog(DEBUG2, "row filter \"%s\" was not matched", s);
+ pfree(s);
+
MemoryContextSwitchTo(oldcxt);
ExecDropSingleTupleTableSlot(ecxt->ecxt_scantuple);
FreeExecutorState(estate);
return;
}
+
+ s = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, CStringGetTextDatum(nodeToString(row_filter)), ObjectIdGetDatum(relentry->relid)));
+ elog(DEBUG2, "row filter \"%s\" was matched", s);
+ pfree(s);
}
MemoryContextSwitchTo(oldcxt);
@@ -666,9 +729,12 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
MemoryContext oldctx = MemoryContextSwitchTo(CacheMemoryContext);
char *s = TextDatumGetCString(rf_datum);
+ char *t = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, rf_datum, ObjectIdGetDatum(entry->relid)));
Node *rf_node = stringToNode(s);
entry->row_filter = lappend(entry->row_filter, rf_node);
MemoryContextSwitchTo(oldctx);
+
+ elog(DEBUG2, "row filter \"%s\" found for publication \"%s\" and relation \"%s\"", t, pub->name, get_rel_name(relid));
}
ReleaseSysCache(rf_tuple);
--
2.19.1
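The evaluation loop in pgoutput_change() shown above bails out on the first filter whose result is false or NULL, so a change is sent only when every applicable filter matched. A small Python model of that control flow (None stands in for SQL NULL):

```python
def change_is_published(filter_results):
    # mirrors the loop in pgoutput_change(): bail out on the first
    # filter yielding false or NULL; publish only if all matched
    for res in filter_results:
        if res is not True:
            return False
    return True

print(change_is_published([True, True]))   # published
print(change_is_published([True, None]))   # NULL counts as no match
```

An empty filter list publishes the row, matching the `list_length(relentry->row_filter) > 0` guard before the loop.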
v2-0009-Add-simple-BDR-test-for-row-filtering.patch (text/x-diff)
From 59d97603c8b0ecc20a7f04acee2e123fb1a26265 Mon Sep 17 00:00:00 2001
From: Alexey Kondratov <kondratov.aleksey@gmail.com>
Date: Sun, 25 Aug 2019 16:21:32 +0300
Subject: [PATCH v2 9/9] Add simple BDR test for row filtering
---
.../{010_row_filter.pl => 013_row_filter.pl} | 0
src/test/subscription/t/014_simple_bdr.pl | 194 ++++++++++++++++++
2 files changed, 194 insertions(+)
rename src/test/subscription/t/{010_row_filter.pl => 013_row_filter.pl} (100%)
create mode 100644 src/test/subscription/t/014_simple_bdr.pl
diff --git a/src/test/subscription/t/010_row_filter.pl b/src/test/subscription/t/013_row_filter.pl
similarity index 100%
rename from src/test/subscription/t/010_row_filter.pl
rename to src/test/subscription/t/013_row_filter.pl
diff --git a/src/test/subscription/t/014_simple_bdr.pl b/src/test/subscription/t/014_simple_bdr.pl
new file mode 100644
index 0000000000..f33b754d2a
--- /dev/null
+++ b/src/test/subscription/t/014_simple_bdr.pl
@@ -0,0 +1,194 @@
+# Test simple bidirectional logical replication behavior with row filtering
+# ID is meant to be something like uuid (e.g. from pgcrypto), but integer
+# type is used for simplicity.
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 10;
+
+our $node_cloud;
+our $node_remote;
+our $cloud_appname = 'cloud_sub';
+our $remote_appname = 'remote_sub';
+
+sub check_data_consistency
+{
+ my $test_name = shift;
+ my $query = shift;
+ my $true_result = shift;
+ my $result;
+
+ $node_cloud->wait_for_catchup($remote_appname);
+ $node_remote->wait_for_catchup($cloud_appname);
+
+ $result =
+ $node_remote->safe_psql('postgres', $query);
+ is($result, $true_result, $test_name . ' on remote');
+ $result =
+ $node_cloud->safe_psql('postgres', $query);
+ is($result, $true_result, $test_name . ' on cloud');
+
+ return;
+}
+
+# Create cloud node
+$node_cloud = get_new_node('publisher');
+$node_cloud->init(allows_streaming => 'logical');
+$node_cloud->start;
+
+# Create remote node
+$node_remote = get_new_node('subscriber');
+$node_remote->init(allows_streaming => 'logical');
+$node_remote->start;
+
+# Test tables
+my $users_table = "CREATE TABLE users (
+ id integer primary key,
+ name text,
+ is_cloud boolean
+ );";
+my $docs_table = "CREATE TABLE docs (
+ id integer primary key,
+ user_id integer,
+ FOREIGN KEY (user_id) REFERENCES users (id),
+ content text,
+ is_cloud boolean
+ );";
+
+# Setup structure on cloud server
+$node_cloud->safe_psql('postgres', $users_table);
+
+# Setup structure on remote server
+$node_remote->safe_psql('postgres', $users_table);
+
+# Put in initial data
+$node_cloud->safe_psql('postgres',
+ "INSERT INTO users (id, name, is_cloud) VALUES (1, 'user1_on_cloud', TRUE);");
+$node_remote->safe_psql('postgres',
+ "INSERT INTO users (id, name, is_cloud) VALUES (2, 'user2_on_remote', FALSE);");
+$node_remote->safe_psql('postgres',
+ "INSERT INTO users (id, name, is_cloud) VALUES (100, 'user100_local_on_remote', TRUE);");
+
+# Setup logical replication
+$node_cloud->safe_psql('postgres', "CREATE PUBLICATION cloud;");
+$node_cloud->safe_psql('postgres', "ALTER PUBLICATION cloud ADD TABLE users WHERE (is_cloud IS TRUE);");
+
+$node_remote->safe_psql('postgres', "CREATE PUBLICATION remote;");
+$node_remote->safe_psql('postgres', "ALTER PUBLICATION remote ADD TABLE users WHERE (is_cloud IS FALSE);");
+
+my $cloud_connstr = $node_cloud->connstr . ' dbname=postgres';
+$node_remote->safe_psql('postgres',
+ "CREATE SUBSCRIPTION cloud_to_remote CONNECTION '$cloud_connstr application_name=$remote_appname' PUBLICATION cloud"
+);
+
+my $remote_connstr = $node_remote->connstr . ' dbname=postgres';
+$node_cloud->safe_psql('postgres',
+ "CREATE SUBSCRIPTION remote_to_cloud CONNECTION '$remote_connstr application_name=$cloud_appname' PUBLICATION remote"
+);
+
+# Wait for initial table sync to finish
+my $synced_query =
+"SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_remote->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for remote to synchronize data";
+$node_cloud->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for cloud to synchronize data";
+
+$node_cloud->wait_for_catchup($remote_appname);
+$node_remote->wait_for_catchup($cloud_appname);
+
+# Test initial table sync
+my $result =
+ $node_remote->safe_psql('postgres', "SELECT count(*) from users");
+is($result, qq(3), 'check initial table sync on remote');
+$result =
+ $node_cloud->safe_psql('postgres', "SELECT count(*) from users");
+is($result, qq(2), 'check initial table sync on cloud');
+
+# Test BDR
+$node_cloud->safe_psql('postgres',
+ "INSERT INTO users (id, name, is_cloud) VALUES (3, 'user3_on_cloud', TRUE);");
+$node_cloud->safe_psql('postgres',
+ "INSERT INTO users (id, name, is_cloud) VALUES (4, 'user4_on_cloud', TRUE);");
+$node_remote->safe_psql('postgres',
+ "INSERT INTO users (id, name, is_cloud) VALUES (5, 'user5_on_remote', FALSE);");
+
+$node_cloud->wait_for_catchup($remote_appname);
+$node_remote->wait_for_catchup($cloud_appname);
+
+$result =
+ $node_remote->safe_psql('postgres', "SELECT id, name, is_cloud FROM users ORDER BY id;");
+is($result, qq(1|user1_on_cloud|t
+2|user2_on_remote|f
+3|user3_on_cloud|t
+4|user4_on_cloud|t
+5|user5_on_remote|f
+100|user100_local_on_remote|t), 'check users on remote');
+$result =
+ $node_cloud->safe_psql('postgres', "SELECT id, name, is_cloud FROM users ORDER BY id;");
+is($result, qq(1|user1_on_cloud|t
+2|user2_on_remote|f
+3|user3_on_cloud|t
+4|user4_on_cloud|t
+5|user5_on_remote|f), 'check users on cloud');
+
+# Add table to cloud server
+$node_cloud->safe_psql('postgres', $docs_table);
+
+# Add table to remote server
+$node_remote->safe_psql('postgres', $docs_table);
+
+# Put in initial data
+$node_cloud->safe_psql('postgres',
+ "INSERT INTO docs (id, user_id, content, is_cloud) VALUES (1, 3, 'user3__doc1_on_cloud', TRUE);");
+
+# Add table to publication
+$node_cloud->safe_psql('postgres', "ALTER PUBLICATION cloud ADD TABLE docs WHERE (is_cloud IS TRUE);");
+$node_remote->safe_psql('postgres', "ALTER PUBLICATION remote ADD TABLE docs WHERE (is_cloud IS FALSE);");
+
+# Refresh
+$node_cloud->safe_psql('postgres', "ALTER SUBSCRIPTION remote_to_cloud REFRESH PUBLICATION;");
+$node_remote->safe_psql('postgres', "ALTER SUBSCRIPTION cloud_to_remote REFRESH PUBLICATION;");
+
+# Test BDR on new table
+$node_cloud->safe_psql('postgres',
+ "INSERT INTO docs (id, user_id, content, is_cloud) VALUES (2, 3, 'user3__doc2_on_cloud', TRUE);");
+$node_remote->safe_psql('postgres',
+ "INSERT INTO docs (id, user_id, content, is_cloud) VALUES (3, 3, 'user3__doc3_on_remote', FALSE);");
+
+check_data_consistency(
+ 'check docs after insert',
+ "SELECT id, user_id, content, is_cloud FROM docs WHERE user_id = 3 ORDER BY id;",
+ qq(1|3|user3__doc1_on_cloud|t
+2|3|user3__doc2_on_cloud|t
+3|3|user3__doc3_on_remote|f)
+);
+
+# Test update of remote doc on cloud and vice versa
+$node_cloud->safe_psql('postgres',
+ "UPDATE docs SET content = 'user3__doc3_on_remote__updated', is_cloud = TRUE WHERE id = 3;");
+$node_remote->safe_psql('postgres',
+ "UPDATE docs SET content = 'user3__doc2_on_cloud__to_be_deleted', is_cloud = FALSE WHERE id = 2;");
+
+check_data_consistency(
+ 'check docs after update',
+ "SELECT id, user_id, content, is_cloud FROM docs WHERE user_id = 3 ORDER BY id;",
+ qq(1|3|user3__doc1_on_cloud|t
+2|3|user3__doc2_on_cloud__to_be_deleted|f
+3|3|user3__doc3_on_remote__updated|t)
+);
+
+# Test delete
+$node_remote->safe_psql('postgres',
+ "DELETE FROM docs WHERE id = 2;");
+
+check_data_consistency(
+ 'check docs after delete',
+ "SELECT id, user_id, content, is_cloud FROM docs WHERE user_id = 3 ORDER BY id;",
+ qq(1|3|user3__doc1_on_cloud|t
+3|3|user3__doc3_on_remote__updated|t)
+);
+
+$node_remote->stop('fast');
+$node_cloud->stop('fast');
--
2.19.1
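The initial-sync expectations in this BDR test can be sanity-checked with a toy Python model (not part of the patch): each node publishes only rows satisfying its filter, and the other node applies them:

```python
# each node keeps users as {id: (name, is_cloud)}
cloud = {1: ('user1_on_cloud', True)}
remote = {2: ('user2_on_remote', False), 100: ('user100_local_on_remote', True)}

def replicate(src, dst, predicate):
    # copy over the rows that satisfy the publication's row filter
    for key, row in src.items():
        if predicate(row):
            dst.setdefault(key, row)

replicate(cloud, remote, lambda row: row[1] is True)    # publication "cloud"
replicate(remote, cloud, lambda row: row[1] is False)   # publication "remote"

print(len(remote), len(cloud))   # 3 2, matching the initial-sync checks
```

user100 has is_cloud = TRUE, so the remote's `is_cloud IS FALSE` publication never sends it; it stays local to the remote node, which is why the counts are asymmetric.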
On Sun, Feb 3, 2019 at 07:14, Andres Freund <andres@anarazel.de> wrote:
As far as I can tell, the patch has not been refreshed since. So I'm
marking this as returned with feedback for now. Please resubmit once
ready.
I fixed all of the bugs pointed out in this thread. I decided to disallow
UDFs in filters (it is safer for a first version); we can add this
functionality later. However, I'll check whether allowing "safe"
functions (i.e. builtin functions) is OK. I added more documentation
explaining that expressions are executed with the role used for the
replication connection, and also that columns used in expressions must be
part of the PK or REPLICA IDENTITY. I added regression tests.
Comments?
--
Euler Taveira Timbira -
http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Attachments:
0001-Remove-unused-atttypmod-column-from-initial-table-sy.patch (text/x-patch)
From 87945236590e9fd37b203d325b74dc5baccee64d Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Fri, 9 Mar 2018 18:39:22 +0000
Subject: [PATCH 1/8] Remove unused atttypmod column from initial table
synchronization
Since commit 7c4f52409a8c7d85ed169bbbc1f6092274d03920, atttypmod was
added but not used. The removal is safe because COPY from publisher
does not need such information.
---
src/backend/replication/logical/tablesync.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 7881079e96..0a565dd837 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -647,7 +647,7 @@ fetch_remote_table_info(char *nspname, char *relname,
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[2] = {OIDOID, CHAROID};
- Oid attrRow[4] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
+ Oid attrRow[3] = {TEXTOID, OIDOID, BOOLOID};
bool isnull;
int natt;
@@ -691,7 +691,6 @@ fetch_remote_table_info(char *nspname, char *relname,
appendStringInfo(&cmd,
"SELECT a.attname,"
" a.atttypid,"
- " a.atttypmod,"
" a.attnum = ANY(i.indkey)"
" FROM pg_catalog.pg_attribute a"
" LEFT JOIN pg_catalog.pg_index i"
@@ -703,7 +702,7 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->remoteid,
(walrcv_server_version(wrconn) >= 120000 ? "AND a.attgenerated = ''" : ""),
lrel->remoteid);
- res = walrcv_exec(wrconn, cmd.data, 4, attrRow);
+ res = walrcv_exec(wrconn, cmd.data, 3, attrRow);
if (res->status != WALRCV_OK_TUPLES)
ereport(ERROR,
@@ -724,7 +723,7 @@ fetch_remote_table_info(char *nspname, char *relname,
Assert(!isnull);
lrel->atttyps[natt] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
Assert(!isnull);
- if (DatumGetBool(slot_getattr(slot, 4, &isnull)))
+ if (DatumGetBool(slot_getattr(slot, 3, &isnull)))
lrel->attkeys = bms_add_member(lrel->attkeys, natt);
/* Should never happen. */
--
2.11.0
0003-Refactor-function-create_estate_for_relation.patch (text/x-patch)
From 3a5b4c541982357c2231b9882ac01f1f0d0a8e29 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Tue, 27 Feb 2018 02:21:03 +0000
Subject: [PATCH 3/8] Refactor function create_estate_for_relation
Relation localrel is the only LogicalRepRelMapEntry structure member
that is useful for create_estate_for_relation.
---
src/backend/replication/logical/worker.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 43edfef089..31fc7c5048 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -173,7 +173,7 @@ ensure_transaction(void)
* This is based on similar code in copy.c
*/
static EState *
-create_estate_for_relation(LogicalRepRelMapEntry *rel)
+create_estate_for_relation(Relation rel)
{
EState *estate;
ResultRelInfo *resultRelInfo;
@@ -183,13 +183,13 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel)
rte = makeNode(RangeTblEntry);
rte->rtekind = RTE_RELATION;
- rte->relid = RelationGetRelid(rel->localrel);
- rte->relkind = rel->localrel->rd_rel->relkind;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
rte->rellockmode = AccessShareLock;
ExecInitRangeTable(estate, list_make1(rte));
resultRelInfo = makeNode(ResultRelInfo);
- InitResultRelInfo(resultRelInfo, rel->localrel, 1, NULL, 0);
+ InitResultRelInfo(resultRelInfo, rel, 1, NULL, 0);
estate->es_result_relations = resultRelInfo;
estate->es_num_result_relations = 1;
@@ -589,7 +589,7 @@ apply_handle_insert(StringInfo s)
}
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -696,7 +696,7 @@ apply_handle_update(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -815,7 +815,7 @@ apply_handle_delete(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
--
2.11.0
0004-Rename-a-WHERE-node.patch (text/x-patch)
From 7ef5ccffcb7bc71d298427e7b2c3a2cfae8556c6 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Wed, 24 Jan 2018 17:01:31 -0200
Subject: [PATCH 4/8] Rename a WHERE node
A WHERE clause will be used for row filtering in logical replication. We
already have a similar node: 'WHERE (condition here)'. Let's rename the
node to a generic name and use it for row filtering too.
---
src/backend/parser/gram.y | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 3dc0e8a4fb..61cc59fe7c 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -476,7 +476,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <node> def_arg columnElem where_clause where_or_current_clause
a_expr b_expr c_expr AexprConst indirection_el opt_slice_bound
columnref in_expr having_clause func_table xmltable array_expr
- ExclusionWhereClause operator_def_arg
+ OptWhereClause operator_def_arg
%type <list> rowsfrom_item rowsfrom_list opt_col_def_list
%type <boolean> opt_ordinality
%type <list> ExclusionConstraintList ExclusionConstraintElem
@@ -3710,7 +3710,7 @@ ConstraintElem:
$$ = (Node *)n;
}
| EXCLUDE access_method_clause '(' ExclusionConstraintList ')'
- opt_c_include opt_definition OptConsTableSpace ExclusionWhereClause
+ opt_c_include opt_definition OptConsTableSpace OptWhereClause
ConstraintAttributeSpec
{
Constraint *n = makeNode(Constraint);
@@ -3812,7 +3812,7 @@ ExclusionConstraintElem: index_elem WITH any_operator
}
;
-ExclusionWhereClause:
+OptWhereClause:
WHERE '(' a_expr ')' { $$ = $3; }
| /*EMPTY*/ { $$ = NULL; }
;
--
2.11.0
0002-Store-number-of-tuples-in-WalRcvExecResult.patch (text/x-patch)
From 4b5ca55f83e8036d6892a458bf73c891329c01f8 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Fri, 9 Mar 2018 17:37:36 +0000
Subject: [PATCH 2/8] Store number of tuples in WalRcvExecResult
It seems to be a useful information while allocating memory for queries
that returns more than one row. It reduces memory allocation
for initial table synchronization.
---
src/backend/replication/libpqwalreceiver/libpqwalreceiver.c | 5 +++--
src/backend/replication/logical/tablesync.c | 5 ++---
src/include/replication/walreceiver.h | 1 +
3 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 765d58d120..e657177c00 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -878,6 +878,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
errdetail("Expected %d fields, got %d fields.",
nRetTypes, nfields)));
+ walres->ntuples = PQntuples(pgres);
walres->tuplestore = tuplestore_begin_heap(true, false, work_mem);
/* Create tuple descriptor corresponding to expected result. */
@@ -888,7 +889,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
/* No point in doing more here if there were no tuples returned. */
- if (PQntuples(pgres) == 0)
+ if (walres->ntuples == 0)
return;
/* Create temporary context for local allocations. */
@@ -897,7 +898,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
ALLOCSET_DEFAULT_SIZES);
/* Process returned rows. */
- for (tupn = 0; tupn < PQntuples(pgres); tupn++)
+ for (tupn = 0; tupn < walres->ntuples; tupn++)
{
char *cstrs[MaxTupleAttributeNumber];
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 0a565dd837..42db4ada9e 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -709,9 +709,8 @@ fetch_remote_table_info(char *nspname, char *relname,
(errmsg("could not fetch table info for table \"%s.%s\": %s",
nspname, relname, res->err)));
- /* We don't know the number of rows coming, so allocate enough space. */
- lrel->attnames = palloc0(MaxTupleAttributeNumber * sizeof(char *));
- lrel->atttyps = palloc0(MaxTupleAttributeNumber * sizeof(Oid));
+ lrel->attnames = palloc0(res->ntuples * sizeof(char *));
+ lrel->atttyps = palloc0(res->ntuples * sizeof(Oid));
lrel->attkeys = NULL;
natt = 0;
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index 7f2927cb46..d0fb98df09 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -197,6 +197,7 @@ typedef struct WalRcvExecResult
char *err;
Tuplestorestate *tuplestore;
TupleDesc tupledesc;
+ int ntuples;
} WalRcvExecResult;
/* libpqwalreceiver hooks */
--
2.11.0
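The memory saving from sizing the arrays by ntuples instead of MaxTupleAttributeNumber is easy to quantify. A back-of-the-envelope sketch, assuming 64-bit pointer and 4-byte Oid sizes (these are typical, not guaranteed):

```python
# Before the patch, fetch_remote_table_info() always sized its attribute
# arrays for the theoretical maximum number of columns.
MAX_TUPLE_ATTRIBUTE_NUMBER = 1664   # MaxTupleAttributeNumber in htup_details.h

ncols = 3        # e.g. a table with three replicated columns
ptr, oid = 8, 4  # assumed sizeof(char *) and sizeof(Oid)

before = MAX_TUPLE_ATTRIBUTE_NUMBER * (ptr + oid)
after = ncols * (ptr + oid)
print(before, after)   # bytes for attnames + atttyps, per table sync
```

For a typical small table that is roughly 20 kB down to a few dozen bytes per synchronized table.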
0005-Row-filtering-for-logical-replication.patch (text/x-patch)
From fc1090c6922d1a66d3ad03d21441829f8cae0472 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Tue, 27 Feb 2018 04:03:13 +0000
Subject: [PATCH 5/8] Row filtering for logical replication
When you define or modify a publication you optionally can filter rows
to be published using a WHERE condition. This condition is any
expression that evaluates to boolean. Only those rows that
satisfy the WHERE condition will be sent to subscribers.
---
doc/src/sgml/catalogs.sgml | 9 +++
doc/src/sgml/ref/alter_publication.sgml | 11 ++-
doc/src/sgml/ref/create_publication.sgml | 26 +++++-
src/backend/catalog/pg_publication.c | 102 ++++++++++++++++++++++--
src/backend/commands/publicationcmds.c | 93 +++++++++++++++-------
src/backend/parser/gram.y | 26 ++++--
src/backend/parser/parse_agg.c | 10 +++
src/backend/parser/parse_expr.c | 14 +++-
src/backend/parser/parse_func.c | 3 +
src/backend/replication/logical/tablesync.c | 119 +++++++++++++++++++++++++---
src/backend/replication/logical/worker.c | 2 +-
src/backend/replication/pgoutput/pgoutput.c | 100 ++++++++++++++++++++++-
src/include/catalog/pg_publication.h | 9 ++-
src/include/catalog/pg_publication_rel.h | 10 ++-
src/include/catalog/toasting.h | 1 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 ++-
src/include/parser/parse_node.h | 1 +
src/include/replication/logicalrelation.h | 2 +
src/test/regress/expected/publication.out | 29 +++++++
src/test/regress/sql/publication.sql | 21 +++++
src/test/subscription/t/013_row_filter.pl | 96 ++++++++++++++++++++++
22 files changed, 629 insertions(+), 67 deletions(-)
create mode 100644 src/test/subscription/t/013_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 4c7e93892a..88177279c7 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -5587,6 +5587,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<entry><literal><link linkend="catalog-pg-class"><structname>pg_class</structname></link>.oid</literal></entry>
<entry>Reference to relation</entry>
</row>
+
+ <row>
+ <entry><structfield>prqual</structfield></entry>
+ <entry><type>pg_node_tree</type></entry>
+ <entry></entry>
+ <entry>Expression tree (in the form of a
+ <function>nodeToString()</function> representation) for the relation's
+ qualifying condition</entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index 534e598d93..9608448207 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_USER | SESSION_USER }
@@ -91,7 +91,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is executed with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
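For illustration, the amended ALTER PUBLICATION syntax lets a filter be supplied when adding or resetting tables; a hypothetical session (publication, table, and column names are made up):

```sql
-- Add a table with a row filter (parentheses are required)
ALTER PUBLICATION mypub ADD TABLE orders WHERE (status = 'shipped');

-- SET TABLE replaces the publication's table list; per the patch, changing
-- a table's filter re-creates its publication/relation mapping
ALTER PUBLICATION mypub SET TABLE orders WHERE (status <> 'draft');
```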
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 99f87ca393..6e99943374 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
@@ -68,7 +68,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
that table is added to the publication. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are added.
Optionally, <literal>*</literal> can be specified after the table name to
- explicitly indicate that descendant tables are included.
+ explicitly indicate that descendant tables are included. If the optional
+ <literal>WHERE</literal> clause is specified, rows that do not satisfy
+ the <replaceable class="parameter">expression</replaceable> will not be
+ published. Note that parentheses are required around the expression.
</para>
<para>
@@ -157,6 +160,13 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ Columns used in the <literal>WHERE</literal> clause must be part of the
+ primary key or be covered by <literal>REPLICA IDENTITY</literal>;
+ otherwise, <command>UPDATE</command> and <command>DELETE</command>
+ operations will not be replicated.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
on the outcome, it may be published as either <command>INSERT</command> or
@@ -171,6 +181,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -184,6 +199,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index f8475c1aba..ff30fdd9f6 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -34,6 +34,10 @@
#include "catalog/pg_publication.h"
#include "catalog/pg_publication_rel.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
+
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -149,18 +153,21 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
+ Oid relid = RelationGetRelid(targetrel->relation);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ RangeTblEntry *rte;
+ Node *whereclause;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -180,10 +187,27 @@ publication_add_relation(Oid pubid, Relation targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel), pub->name)));
+ RelationGetRelationName(targetrel->relation), pub->name)));
}
- check_publication_add_relation(targetrel);
+ check_publication_add_relation(targetrel->relation);
+
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(targetrel->whereClause);
+
+ rte = addRangeTableEntryForRelation(pstate, targetrel->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addRTEtoQuery(pstate, rte, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(targetrel->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -197,6 +221,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -213,11 +243,17 @@ publication_add_relation(Oid pubid, Relation targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel);
+ CacheInvalidateRelcache(targetrel->relation);
return myself;
}
@@ -292,6 +328,62 @@ GetPublicationRelations(Oid pubid)
}
/*
+ * Gets list of PublicationRelationQuals for a publication.
+ */
+List *
+GetPublicationRelationQuals(Oid pubid)
+{
+ List *result;
+ Relation pubrelsrel;
+ ScanKeyData scankey;
+ SysScanDesc scan;
+ HeapTuple tup;
+
+ /* Find all relations associated with the publication. */
+ pubrelsrel = heap_open(PublicationRelRelationId, AccessShareLock);
+
+ ScanKeyInit(&scankey,
+ Anum_pg_publication_rel_prpubid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(pubid));
+
+ scan = systable_beginscan(pubrelsrel, PublicationRelPrrelidPrpubidIndexId,
+ true, NULL, 1, &scankey);
+
+ result = NIL;
+ while (HeapTupleIsValid(tup = systable_getnext(scan)))
+ {
+ Form_pg_publication_rel pubrel;
+ PublicationRelationQual *relqual;
+ Datum value_datum;
+ char *qual_value;
+ Node *qual_expr;
+ bool isnull;
+
+ pubrel = (Form_pg_publication_rel) GETSTRUCT(tup);
+
+ value_datum = heap_getattr(tup, Anum_pg_publication_rel_prqual, RelationGetDescr(pubrelsrel), &isnull);
+ if (!isnull)
+ {
+ qual_value = TextDatumGetCString(value_datum);
+ qual_expr = (Node *) stringToNode(qual_value);
+ }
+ else
+ qual_expr = NULL;
+
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = table_open(pubrel->prrelid, ShareUpdateExclusiveLock);
+ relqual->whereClause = copyObject(qual_expr);
+ result = lappend(result, relqual);
+ }
+
+ systable_endscan(scan);
+ heap_close(pubrelsrel, AccessShareLock);
+
+ return result;
+}
+
+/*
* Gets list of publication oids for publications marked as FOR ALL TABLES.
*/
List *
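Since prqual is stored as a nodeToString() tree, it can be read back in human-readable form with pg_get_expr(), which is what the tablesync query below does; a hypothetical inspection query:

```sql
-- Show each published table and its row filter (NULL means no filter)
SELECT p.pubname,
       pr.prrelid::regclass AS tablename,
       pg_get_expr(pr.prqual, pr.prrelid) AS rowfilter
FROM pg_publication p
JOIN pg_publication_rel pr ON pr.prpubid = p.oid;
```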
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 4d48be0b92..6d56893c3e 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -344,6 +344,27 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ /*
+ * ALTER PUBLICATION ... DROP TABLE cannot contain a WHERE clause. Use the
+ * publication_table_list node (which accepts a WHERE clause) but forbid
+ * the WHERE clause in it. Using a relation_expr_list node just for the
+ * DROP TABLE part is not worth the trouble.
+ */
+ if (stmt->tableAction == DEFELEM_DROP)
+ {
+ ListCell *lc;
+
+ foreach(lc, stmt->tables)
+ {
+ PublicationTable *t = lfirst(lc);
+ if (t->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot use a WHERE clause for removing table from publication \"%s\"",
+ NameStr(pubform->pubname))));
+ }
+ }
+
rels = OpenTableList(stmt->tables);
if (stmt->tableAction == DEFELEM_ADD)
@@ -352,47 +373,55 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
PublicationDropTables(pubid, rels, false);
else /* DEFELEM_SET */
{
- List *oldrelids = GetPublicationRelations(pubid);
+ List *oldrels = GetPublicationRelationQuals(pubid);
List *delrels = NIL;
ListCell *oldlc;
/* Calculate which relations to drop. */
- foreach(oldlc, oldrelids)
+ foreach(oldlc, oldrels)
{
- Oid oldrelid = lfirst_oid(oldlc);
+ PublicationRelationQual *oldrel = lfirst(oldlc);
+ PublicationRelationQual *newrel;
ListCell *newlc;
bool found = false;
foreach(newlc, rels)
{
- Relation newrel = (Relation) lfirst(newlc);
+ newrel = (PublicationRelationQual *) lfirst(newlc);
- if (RelationGetRelid(newrel) == oldrelid)
+ if (RelationGetRelid(newrel->relation) == RelationGetRelid(oldrel->relation))
{
found = true;
break;
}
}
- if (!found)
+ /*
+ * Remove the publication / relation mapping if (i) the table is not
+ * found in the new list or (ii) the table is found in the new list but
+ * its qual does not match the old one (in this case, a simple tuple
+ * update is not enough because of the dependencies).
+ */
+ if (!found || !equal(oldrel->whereClause, newrel->whereClause))
{
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
+ PublicationRelationQual *oldrelqual = palloc(sizeof(PublicationRelationQual));
+ oldrelqual->relation = table_open(RelationGetRelid(oldrel->relation),
+ ShareUpdateExclusiveLock);
- delrels = lappend(delrels, oldrel);
+ delrels = lappend(delrels, oldrelqual);
}
}
/* And drop them. */
PublicationDropTables(pubid, delrels, true);
+ CloseTableList(oldrels);
+ CloseTableList(delrels);
/*
* Don't bother calculating the difference for adding, we'll catch and
* skip existing ones when doing catalog update.
*/
PublicationAddTables(pubid, rels, true, stmt);
-
- CloseTableList(delrels);
}
CloseTableList(rels);
@@ -502,16 +531,18 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationQual *relqual;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
- bool recurse = rv->inh;
- Relation rel;
- Oid myrelid;
+ PublicationTable *t = lfirst(lc);
+ RangeVar *rv = castNode(RangeVar, t->relation);
+ bool recurse = rv->inh;
+ Relation rel;
+ Oid myrelid;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
@@ -531,8 +562,10 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = rel;
+ relqual->whereClause = t->whereClause;
+ rels = lappend(rels, relqual);
relids = lappend_oid(relids, myrelid);
/* Add children of this rel, if requested */
@@ -560,7 +593,11 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = rel;
+ /* child inherits WHERE clause from parent */
+ relqual->whereClause = t->whereClause;
+ rels = lappend(rels, relqual);
relids = lappend_oid(relids, childrelid);
}
}
@@ -581,10 +618,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -600,13 +639,13 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(RelationGetRelid(rel->relation), GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->relation->rd_rel->relkind),
+ RelationGetRelationName(rel->relation));
obj = publication_add_relation(pubid, rel, if_not_exists);
if (stmt)
@@ -632,8 +671,8 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
+ Oid relid = RelationGetRelid(rel->relation);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
ObjectIdGetDatum(relid),
@@ -646,7 +685,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(rel->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
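The DEFELEM_DROP check above means a filter can never accompany a removal; a sketch of the resulting behavior (publication and table names are hypothetical):

```sql
ALTER PUBLICATION mypub DROP TABLE orders;                 -- OK
ALTER PUBLICATION mypub DROP TABLE orders WHERE (id > 0);  -- ERROR: cannot use
                                                           -- a WHERE clause here
```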
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 61cc59fe7c..2580da9deb 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -404,13 +404,13 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
relation_expr_list dostmt_opt_list
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
- publication_name_list
+ publication_name_list publication_table_list
vacuum_relation_list opt_vacuum_relation_list
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <value> publication_name_item
%type <list> opt_fdw_options fdw_options
@@ -9518,7 +9518,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9549,7 +9549,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9557,7 +9557,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9565,7 +9565,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE relation_expr_list
+ | ALTER PUBLICATION name DROP TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9575,6 +9575,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
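With the publication_table_list rule above, each table in the list carries its own optional WHERE clause; for example (table and column names are hypothetical):

```sql
CREATE PUBLICATION pub_mixed
    FOR TABLE t1,
              t2 WHERE (region = 'emea'),
              t3 WHERE (amount > 100);
```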
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index c745fcdd2b..b11d159b54 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -544,6 +544,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -933,6 +940,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_GENERATED_COLUMN:
err = _("window functions are not allowed in column generation expressions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 8e136a7981..f82518afc8 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -170,6 +170,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -571,6 +578,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_CALL_ARGUMENT:
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1924,13 +1932,15 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
break;
case EXPR_KIND_CALL_ARGUMENT:
err = _("cannot use subquery in CALL argument");
break;
case EXPR_KIND_COPY_WHERE:
err = _("cannot use subquery in COPY FROM WHERE condition");
break;
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3563,6 +3573,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "WHERE";
case EXPR_KIND_GENERATED_COLUMN:
return "GENERATED AS";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
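Taken together, the parser changes reject aggregates, window functions, set-returning functions, subqueries, and plain function calls in a publication WHERE clause; for instance (assuming a hypothetical table t with an integer column x):

```sql
CREATE PUBLICATION p1 FOR TABLE t WHERE (x > 0);             -- OK
CREATE PUBLICATION p2 FOR TABLE t WHERE (abs(x) > 0);        -- ERROR: function call
CREATE PUBLICATION p3 FOR TABLE t WHERE (x IN (SELECT 1));   -- ERROR: subquery
```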
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 752cf1b315..50653a89d8 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2529,6 +2529,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_GENERATED_COLUMN:
err = _("set-returning functions are not allowed in column generation expressions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 42db4ada9e..5468b694f6 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -637,19 +637,26 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about a remote relation, in a similar fashion to what the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[2] = {OIDOID, CHAROID};
Oid attrRow[3] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[1] = {TEXTOID};
bool isnull;
- int natt;
+ int n;
+ ListCell *lc;
+ bool first;
+
+ /* Avoid trashing relation map cache */
+ memset(lrel, 0, sizeof(LogicalRepRelation));
lrel->nspname = nspname;
lrel->relname = relname;
@@ -713,20 +720,20 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->atttyps = palloc0(res->ntuples * sizeof(Oid));
lrel->attkeys = NULL;
- natt = 0;
+ n = 0;
slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
{
- lrel->attnames[natt] =
+ lrel->attnames[n] =
TextDatumGetCString(slot_getattr(slot, 1, &isnull));
Assert(!isnull);
- lrel->atttyps[natt] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
+ lrel->atttyps[n] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
Assert(!isnull);
if (DatumGetBool(slot_getattr(slot, 3, &isnull)))
- lrel->attkeys = bms_add_member(lrel->attkeys, natt);
+ lrel->attkeys = bms_add_member(lrel->attkeys, n);
/* Should never happen. */
- if (++natt >= MaxTupleAttributeNumber)
+ if (++n >= MaxTupleAttributeNumber)
elog(ERROR, "too many columns in remote table \"%s.%s\"",
nspname, relname);
@@ -734,7 +741,46 @@ fetch_remote_table_info(char *nspname, char *relname,
}
ExecDropSingleTupleTableSlot(slot);
- lrel->natts = natt;
+ lrel->natts = n;
+
+ walrcv_clear_result(res);
+
+ /* Get relation qual */
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd, "SELECT pg_get_expr(prqual, prrelid) FROM pg_publication p INNER JOIN pg_publication_rel pr ON (p.oid = pr.prpubid) WHERE pr.prrelid = %u AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(wrconn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
walrcv_clear_result(res);
pfree(cmd.data);
@@ -750,6 +796,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyState cstate;
@@ -758,7 +805,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -767,10 +814,57 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* list of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
initStringInfo(&cmd);
- appendStringInfo(&cmd, "COPY %s TO STDOUT",
- quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /*
+ * If the publication has any row filters, build a SELECT query with the
+ * row filters OR'ed together for COPY. If no row filters are available,
+ * COPY all of the table contents.
+ */
+ if (list_length(qual) > 0)
+ {
+ ListCell *lc;
+ bool first;
+
+ appendStringInfoString(&cmd, "COPY (SELECT ");
+ /* list of attribute names */
+ first = true;
+ foreach(lc, attnamelist)
+ {
+ char *col = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+ appendStringInfo(&cmd, "%s", quote_identifier(col));
+ }
+ appendStringInfo(&cmd, " FROM %s",
+ quote_qualified_identifier(lrel.nspname, lrel.relname));
+ appendStringInfoString(&cmd, " WHERE ");
+ /* list of OR'ed filters */
+ first = true;
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
+ list_free_deep(qual);
+ }
+ else
+ {
+ appendStringInfo(&cmd, "COPY %s TO STDOUT",
+ quote_qualified_identifier(lrel.nspname, lrel.relname));
+ }
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
@@ -785,7 +879,6 @@ copy_table(Relation rel)
addRangeTableEntryForRelation(pstate, rel, AccessShareLock,
NULL, false, false);
- attnamelist = make_copy_attnamelist(relmapentry);
cstate = BeginCopyFrom(pstate, rel, NULL, false, copy_read_data, attnamelist, NIL);
/* Do the copy */
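For initial synchronization, the code above turns the table's filters into a query of roughly this shape (a sketch, assuming a table public.t with columns a and b, and filters collected from two publications):

```sql
COPY (SELECT a, b FROM public.t
      WHERE (a > 100) OR (b IS NOT NULL)) TO STDOUT
```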
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 31fc7c5048..22b95d52b5 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -172,7 +172,7 @@ ensure_transaction(void)
*
* This is based on similar code in copy.c
*/
-static EState *
+EState *
create_estate_for_relation(Relation rel)
{
EState *estate;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 63687a97ec..49f533280b 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,13 +12,24 @@
*/
#include "postgres.h"
+#include "catalog/pg_type.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
+
+#include "executor/executor.h"
+#include "nodes/execnodes.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/planner.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/inval.h"
#include "utils/int8.h"
#include "utils/memutils.h"
@@ -58,6 +69,7 @@ typedef struct RelationSyncEntry
bool schema_sent; /* did we send the schema? */
bool replicate_valid;
PublicationActions pubactions;
+ List *qual;
} RelationSyncEntry;
/* Map used to remember which relation schemas we sent. */
@@ -333,6 +345,65 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Assert(false);
}
+ /* ... then check row filter */
+ if (list_length(relentry->qual) > 0)
+ {
+ HeapTuple old_tuple;
+ HeapTuple new_tuple;
+ TupleDesc tupdesc;
+ EState *estate;
+ ExprContext *ecxt;
+ MemoryContext oldcxt;
+ ListCell *lc;
+ bool matched = true;
+
+ old_tuple = change->data.tp.oldtuple ? &change->data.tp.oldtuple->tuple : NULL;
+ new_tuple = change->data.tp.newtuple ? &change->data.tp.newtuple->tuple : NULL;
+ tupdesc = RelationGetDescr(relation);
+ estate = create_estate_for_relation(relation);
+
+ /* prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsVirtual);
+
+ ExecStoreHeapTuple(new_tuple ? new_tuple : old_tuple, ecxt->ecxt_scantuple, false);
+
+ foreach (lc, relentry->qual)
+ {
+ Node *qual;
+ ExprState *expr_state;
+ Expr *expr;
+ Oid expr_type;
+ Datum res;
+ bool isnull;
+
+ qual = (Node *) lfirst(lc);
+
+ /* evaluate the row filter */
+ expr_type = exprType(qual);
+ expr = (Expr *) coerce_to_target_type(NULL, qual, expr_type, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+ expr = expression_planner(expr);
+ expr_state = ExecInitExpr(expr, NULL);
+ res = ExecEvalExpr(expr_state, ecxt, &isnull);
+
+ /* if tuple does not match row filter, bail out */
+ if (!DatumGetBool(res) || isnull)
+ {
+ matched = false;
+ break;
+ }
+ }
+
+ MemoryContextSwitchTo(oldcxt);
+
+ ExecDropSingleTupleTableSlot(ecxt->ecxt_scantuple);
+ FreeExecutorState(estate);
+
+ if (!matched)
+ return;
+ }
+
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
@@ -568,10 +639,14 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
*/
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->qual = NIL;
foreach(lc, data->publications)
{
Publication *pub = lfirst(lc);
+ HeapTuple rf_tuple;
+ Datum rf_datum;
+ bool rf_isnull;
if (pub->alltables || list_member_oid(pubids, pub->oid))
{
@@ -581,9 +656,23 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /* Cache row filters, if available */
+ rf_tuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rf_tuple))
+ {
+ rf_datum = SysCacheGetAttr(PUBLICATIONRELMAP, rf_tuple, Anum_pg_publication_rel_prqual, &rf_isnull);
+
+ if (!rf_isnull)
+ {
+ MemoryContext oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ char *s = TextDatumGetCString(rf_datum);
+ Node *rf_node = stringToNode(s);
+ entry->qual = lappend(entry->qual, rf_node);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rf_tuple);
+ }
}
list_free(pubids);
@@ -658,5 +747,10 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
*/
hash_seq_init(&status, RelationSyncCache);
while ((entry = (RelationSyncEntry *) hash_seq_search(&status)) != NULL)
+ {
entry->replicate_valid = false;
+ if (list_length(entry->qual) > 0)
+ list_free_deep(entry->qual);
+ entry->qual = NIL;
+ }
}
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 2dad24fc9f..74ab2c25d1 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -78,15 +78,22 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationQual
+{
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationQual;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
extern List *GetPublicationRelations(Oid pubid);
+extern List *GetPublicationRelationQuals(Oid pubid);
extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(void);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index 5f5bc92ab3..a75b2d5345 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -28,9 +28,13 @@
*/
CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
{
- Oid oid; /* oid */
- Oid prpubid; /* Oid of the publication */
- Oid prrelid; /* Oid of the relation */
+ Oid oid; /* oid */
+ Oid prpubid; /* Oid of the publication */
+ Oid prrelid; /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index 5ee628c837..aedf27b483 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -66,6 +66,7 @@ DECLARE_TOAST(pg_namespace, 4163, 4164);
DECLARE_TOAST(pg_partitioned_table, 4165, 4166);
DECLARE_TOAST(pg_policy, 4167, 4168);
DECLARE_TOAST(pg_proc, 2836, 2837);
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
DECLARE_TOAST(pg_rewrite, 2838, 2839);
DECLARE_TOAST(pg_seclabel, 3598, 3599);
DECLARE_TOAST(pg_statistic, 2840, 2841);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index ffb4cd4bcc..4e624317b0 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -476,6 +476,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 12e9730dd0..91cd750047 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3462,12 +3462,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3480,7 +3487,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 3d8039aa51..048a445030 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -73,6 +73,7 @@ typedef enum ParseExprKind
EXPR_KIND_CALL_ARGUMENT, /* procedure argument in CALL */
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 85e0b6ea62..29af52ce3a 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -39,4 +39,6 @@ extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
extern void logicalrep_typmap_update(LogicalRepTyp *remotetyp);
extern char *logicalrep_typmap_gettypname(Oid remoteid);
+extern EState *create_estate_for_relation(Relation rel);
+
#endif /* LOGICALRELATION_H */
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index afbbdd543d..cf67b7b186 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -107,6 +107,35 @@ Tables:
DROP TABLE testpub_tbl3, testpub_tbl3a;
DROP PUBLICATION testpub3, testpub4;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in WHERE
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f | t | t | t | t
+Tables:
+ "public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: "testpub_view" is not a table
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 815410b3c5..20c874eb67 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -60,6 +60,27 @@ CREATE PUBLICATION testpub4 FOR TABLE ONLY testpub_tbl3;
DROP TABLE testpub_tbl3, testpub_tbl3a;
DROP PUBLICATION testpub3, testpub4;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+\dRp+ testpub5
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1, pub_test.testpub_nopk;
diff --git a/src/test/subscription/t/013_row_filter.pl b/src/test/subscription/t/013_row_filter.pl
new file mode 100644
index 0000000000..99e6db94d6
--- /dev/null
+++ b/src/test/subscription/t/013_row_filter.pl
@@ -0,0 +1,96 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 4;
+
+# create publisher node
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')");
+
+my $result = $node_publisher->psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 DROP TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')");
+is($result, 3, "syntax error for ALTER PUBLICATION DROP TABLE");
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)");
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3");
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)");
+
+# test row filtering
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 10)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table sync to finish
+my $synced_query =
+"SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+#$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT a, b FROM tab_rowfilter_1");
+is($result, qq(1980|not filtered
+1001|test 1001
+1002|test 1002), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(7|2|10), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check filtered data was copied to subscriber');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
2.11.0
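To make the filter semantics in pgoutput_change above concrete, here is a hedged SQL sketch (object names are illustrative, not taken from the patch's tests). Every qual attached to a matching publication must evaluate to true; a false or NULL result suppresses the row, and if the table appears in several subscribed publications with filters, those filters are effectively ANDed by the foreach loop over relentry->qual:

```sql
-- Hypothetical publication with a row filter.
CREATE TABLE orders (id int PRIMARY KEY, amount int);
CREATE PUBLICATION pub_orders FOR TABLE orders WHERE (amount > 100);

INSERT INTO orders VALUES (1, 500);   -- published: the filter is true
INSERT INTO orders VALUES (2, 50);    -- not published: the filter is false
INSERT INTO orders VALUES (3, NULL);  -- not published: a NULL result counts as non-matching
```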
Attachment: 0007-Publication-where-condition-support-for-pg_dump.patch (text/x-patch)
From 2548f727b6f5d61e87de70ec661335fe82ce6ffa Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Sat, 15 Sep 2018 02:52:00 +0000
Subject: [PATCH 7/8] Publication where condition support for pg_dump
---
src/bin/pg_dump/pg_dump.c | 15 +++++++++++++--
src/bin/pg_dump/pg_dump.h | 1 +
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 8b993d6eae..b41d9fd477 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3930,6 +3930,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_tableoid;
int i_oid;
int i_pubname;
+ int i_pubrelqual;
int i,
j,
ntups;
@@ -3962,7 +3963,8 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
/* Get the publication membership for the table. */
appendPQExpBuffer(query,
- "SELECT pr.tableoid, pr.oid, p.pubname "
+ "SELECT pr.tableoid, pr.oid, p.pubname, "
+ "pg_catalog.pg_get_expr(pr.prqual, pr.prrelid) AS pubrelqual "
"FROM pg_publication_rel pr, pg_publication p "
"WHERE pr.prrelid = '%u'"
" AND p.oid = pr.prpubid",
@@ -3983,6 +3985,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_pubname = PQfnumber(res, "pubname");
+ i_pubrelqual = PQfnumber(res, "pubrelqual");
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -3998,6 +4001,11 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].pubname = pg_strdup(PQgetvalue(res, j, i_pubname));
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, j, i_pubrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, j, i_pubrelqual));
+
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
}
@@ -4026,8 +4034,11 @@ dumpPublicationTable(Archive *fout, PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubrinfo->pubname));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE %s", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating drop query as drop query as the drop is
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 4f9ebb4904..d03eaa1dca 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -612,6 +612,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
TableInfo *pubtable;
char *pubname;
+ char *pubrelqual;
} PublicationRelInfo;
/*
--
2.11.0
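With the pg_dump change above, a publication membership that carries a stored prqual is deparsed via pg_get_expr() and dumped with the WHERE clause appended. Roughly (object names are illustrative):

```sql
-- Membership without a filter is dumped as before:
ALTER PUBLICATION pub_orders ADD TABLE ONLY public.audit_log;
-- Membership with a filter now round-trips through the dump:
ALTER PUBLICATION pub_orders ADD TABLE ONLY public.orders WHERE (amount > 100);
```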
Attachment: 0006-Print-publication-WHERE-condition-in-psql.patch (text/x-patch)
From 72cfcf940ae2100f0bf8728983bf7313a2d9ef83 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Thu, 17 May 2018 20:52:28 +0000
Subject: [PATCH 6/8] Print publication WHERE condition in psql
---
src/bin/psql/describe.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index ee00c5da08..872a544410 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5816,7 +5816,8 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
+ "SELECT n.nspname, c.relname,\n"
+ " pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
@@ -5846,6 +5847,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE %s",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
--
2.11.0
Attachment: 0008-Debug-for-row-filtering.patch (text/x-patch)
From c947168de90527d30d850f8c5c4bd6e090521500 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Wed, 14 Mar 2018 00:53:17 +0000
Subject: [PATCH 8/8] Debug for row filtering
---
src/backend/commands/publicationcmds.c | 11 +++++
src/backend/replication/logical/tablesync.c | 1 +
src/backend/replication/pgoutput/pgoutput.c | 66 +++++++++++++++++++++++++++++
3 files changed, 78 insertions(+)
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 6d56893c3e..65294f2100 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -333,6 +333,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *rels = NIL;
Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
Oid pubid = pubform->oid;
+ ListCell *lc;
/* Check that user is allowed to manipulate the publication tables. */
if (pubform->puballtables)
@@ -344,6 +345,16 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ foreach(lc, stmt->tables)
+ {
+ PublicationTable *t = lfirst(lc);
+
+ if (t->whereClause == NULL)
+ elog(DEBUG3, "publication \"%s\" has no WHERE clause", NameStr(pubform->pubname));
+ else
+ elog(DEBUG3, "publication \"%s\" has WHERE clause", NameStr(pubform->pubname));
+ }
+
/*
* ALTER PUBLICATION ... DROP TABLE cannot contain a WHERE clause. Use
* publication_table_list node (that accepts a WHERE clause) but forbid the
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 5468b694f6..1fc7d5647b 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -865,6 +865,7 @@ copy_table(Relation rel)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
}
+ elog(DEBUG2, "COPY for initial synchronization: %s", cmd.data);
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 49f533280b..4aff5cb515 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -32,6 +32,7 @@
#include "utils/builtins.h"
#include "utils/inval.h"
#include "utils/int8.h"
+#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -321,6 +322,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
MemoryContext old;
RelationSyncEntry *relentry;
+ Form_pg_class class_form;
+ char *schemaname;
+ char *tablename;
+
if (!is_publishable_relation(relation))
return;
@@ -345,6 +350,17 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Assert(false);
}
+ class_form = RelationGetForm(relation);
+ schemaname = get_namespace_name(class_form->relnamespace);
+ tablename = NameStr(class_form->relname);
+
+ if (change->action == REORDER_BUFFER_CHANGE_INSERT)
+ elog(DEBUG1, "INSERT \"%s\".\"%s\" txid: %u", schemaname, tablename, txn->xid);
+ else if (change->action == REORDER_BUFFER_CHANGE_UPDATE)
+ elog(DEBUG1, "UPDATE \"%s\".\"%s\" txid: %u", schemaname, tablename, txn->xid);
+ else if (change->action == REORDER_BUFFER_CHANGE_DELETE)
+ elog(DEBUG1, "DELETE \"%s\".\"%s\" txid: %u", schemaname, tablename, txn->xid);
+
/* ... then check row filter */
if (list_length(relentry->qual) > 0)
{
@@ -362,6 +378,42 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
tupdesc = RelationGetDescr(relation);
estate = create_estate_for_relation(relation);
+#ifdef _NOT_USED
+ if (old_tuple)
+ {
+ int i;
+
+ for (i = 0; i < tupdesc->natts; i++)
+ {
+ Form_pg_attribute attr;
+ HeapTuple type_tuple;
+ Oid typoutput;
+ bool typisvarlena;
+ bool isnull;
+ Datum val;
+ char *outputstr = NULL;
+
+ attr = TupleDescAttr(tupdesc, i);
+
+ /* Figure out type name */
+ type_tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(attr->atttypid));
+ if (HeapTupleIsValid(type_tuple))
+ {
+ /* Get information needed for printing values of a type */
+ getTypeOutputInfo(attr->atttypid, &typoutput, &typisvarlena);
+
+ val = heap_getattr(old_tuple, i + 1, tupdesc, &isnull);
+ if (!isnull)
+ {
+ outputstr = OidOutputFunctionCall(typoutput, val);
+ elog(DEBUG2, "row filter: REPLICA IDENTITY %s: %s", NameStr(attr->attname), outputstr);
+ pfree(outputstr);
+ }
+ }
+ }
+ }
+#endif
+
/* prepare context per tuple */
ecxt = GetPerTupleExprContext(estate);
oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
@@ -377,6 +429,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Oid expr_type;
Datum res;
bool isnull;
+ char *s = NULL;
qual = (Node *) lfirst(lc);
@@ -387,12 +440,22 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
expr_state = ExecInitExpr(expr, NULL);
res = ExecEvalExpr(expr_state, ecxt, &isnull);
+ elog(DEBUG3, "row filter: result: %s ; isnull: %s", (DatumGetBool(res)) ? "true" : "false", (isnull) ? "true" : "false");
+
/* if tuple does not match row filter, bail out */
if (!DatumGetBool(res) || isnull)
{
+ s = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, CStringGetTextDatum(nodeToString(qual)), ObjectIdGetDatum(relentry->relid)));
+ elog(DEBUG2, "row filter \"%s\" was not matched", s);
+ pfree(s);
+
matched = false;
break;
}
+
+ s = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, CStringGetTextDatum(nodeToString(qual)), ObjectIdGetDatum(relentry->relid)));
+ elog(DEBUG2, "row filter \"%s\" was matched", s);
+ pfree(s);
}
MemoryContextSwitchTo(oldcxt);
@@ -666,9 +729,12 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
MemoryContext oldctx = MemoryContextSwitchTo(CacheMemoryContext);
char *s = TextDatumGetCString(rf_datum);
+ char *t = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, rf_datum, ObjectIdGetDatum(entry->relid)));
Node *rf_node = stringToNode(s);
entry->qual = lappend(entry->qual, rf_node);
MemoryContextSwitchTo(oldctx);
+
+ elog(DEBUG2, "row filter \"%s\" found for publication \"%s\" and relation \"%s\"", t, pub->name, get_rel_name(relid));
}
ReleaseSysCache(rf_tuple);
--
2.11.0
On Tue, 27 Aug 2019 at 18:10, <a.kondratov@postgrespro.ru> wrote:
Do you have any plans for continuing working on this patch and
submitting it again on the closest September commitfest? There are only
a few days left. Anyway, I will be glad to review the patch if you do
submit it, though I didn't yet dig deeply into the code.
Sure. See my last email to this thread. I'd appreciate it if you could review it.
Although almost all of the new tests pass, there is a problem with DELETE
replication, so 1 out of 10 tests fails. The DELETE isn't replicated if the
record was created with is_cloud=TRUE on cloud, replicated to remote;
then updated with is_cloud=FALSE on remote, replicated to cloud; then
deleted on remote.
That's because you don't include is_cloud in the PK or REPLICA IDENTITY. I
added a small note about this in the docs.
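To spell out that note with a hedged sketch (table and index names are illustrative): the old tuple of a DELETE (or UPDATE) carries only the replica identity columns, so a filter column such as is_cloud has to be part of the identity for the row filter to see it:

```sql
-- Simplest fix: send all old columns, at some extra WAL cost.
ALTER TABLE tbl REPLICA IDENTITY FULL;

-- Alternatively, include the filter column in a unique index (its columns
-- must be NOT NULL) and use that index as the identity:
CREATE UNIQUE INDEX tbl_ident_idx ON tbl (id, is_cloud);
ALTER TABLE tbl REPLICA IDENTITY USING INDEX tbl_ident_idx;
```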
--
Euler Taveira Timbira -
http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
I think I have also found one shortcoming when using the setup described by Alexey Kondratov. The problem I face is that if both tables (cloud and remote) already contain data at the moment I add the subscription, the whole table is initially copied in both directions. That leads to duplicated data and broken replication, because COPY doesn't take the filtering condition into account. When a publication has filters, the COPY command that is executed when adding a subscription (or altering one to refresh a publication) should filter the data based on the same condition, e.g. COPY (SELECT * FROM ... WHERE ...) TO ...
The current workaround is to always use WITH copy_data = false when subscribing or refreshing, and then manually copy data with the above statement.
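Spelled out, the workaround above looks roughly like this (connection string, publication, and filter are illustrative):

```sql
-- Subscribe without the initial copy, since COPY currently ignores the row filter:
CREATE SUBSCRIPTION tap_sub
    CONNECTION 'host=cloud dbname=postgres'
    PUBLICATION tap_pub_1
    WITH (copy_data = false);

-- Then seed the subscriber manually, running the publication's own predicate
-- on the publisher side:
COPY (SELECT * FROM tab_rowfilter_1 WHERE a > 1000 AND b <> 'filtered') TO STDOUT;
```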
Alexey Zagarin
On 2019-09-01 02:28, Euler Taveira wrote:
On Sun, 3 Feb 2019 at 07:14, Andres Freund <andres@anarazel.de> wrote:
As far as I can tell, the patch has not been refreshed since. So I'm
marking this as returned with feedback for now. Please resubmit once
ready.
I fixed all of the bugs pointed out in this thread. I decided to disallow
0001-Remove-unused-atttypmod-column-from-initial-table-sy.patch
0002-Store-number-of-tuples-in-WalRcvExecResult.patch
0003-Refactor-function-create_estate_for_relation.patch
0004-Rename-a-WHERE-node.patch
0005-Row-filtering-for-logical-replication.patch
0006-Print-publication-WHERE-condition-in-psql.patch
0007-Publication-where-condition-support-for-pg_dump.patch
0008-Debug-for-row-filtering.patch
Hi,
The first 4 of these apply without error, but I can't get 0005 to apply.
This is what I use:
patch --dry-run -b -l -F 5 -p 1 <
/home/aardvark/download/pgpatches/0130/logrep_rowfilter/20190901/0005-Row-filtering-for-logical-replication.patch
checking file doc/src/sgml/catalogs.sgml
Hunk #1 succeeded at 5595 (offset 8 lines).
checking file doc/src/sgml/ref/alter_publication.sgml
checking file doc/src/sgml/ref/create_publication.sgml
checking file src/backend/catalog/pg_publication.c
checking file src/backend/commands/publicationcmds.c
Hunk #1 succeeded at 352 (offset 8 lines).
Hunk #2 succeeded at 381 (offset 8 lines).
Hunk #3 succeeded at 539 (offset 8 lines).
Hunk #4 succeeded at 570 (offset 8 lines).
Hunk #5 succeeded at 601 (offset 8 lines).
Hunk #6 succeeded at 626 (offset 8 lines).
Hunk #7 succeeded at 647 (offset 8 lines).
Hunk #8 succeeded at 679 (offset 8 lines).
Hunk #9 succeeded at 693 (offset 8 lines).
checking file src/backend/parser/gram.y
checking file src/backend/parser/parse_agg.c
checking file src/backend/parser/parse_expr.c
Hunk #4 succeeded at 3571 (offset -2 lines).
checking file src/backend/parser/parse_func.c
Hunk #1 succeeded at 2516 (offset -13 lines).
checking file src/backend/replication/logical/tablesync.c
checking file src/backend/replication/logical/worker.c
checking file src/backend/replication/pgoutput/pgoutput.c
Hunk #1 FAILED at 12.
Hunk #2 succeeded at 60 (offset 2 lines).
Hunk #3 succeeded at 336 (offset 2 lines).
Hunk #4 succeeded at 630 (offset 2 lines).
Hunk #5 succeeded at 647 (offset 2 lines).
Hunk #6 succeeded at 738 (offset 2 lines).
1 out of 6 hunks FAILED
checking file src/include/catalog/pg_publication.h
checking file src/include/catalog/pg_publication_rel.h
checking file src/include/catalog/toasting.h
checking file src/include/nodes/nodes.h
checking file src/include/nodes/parsenodes.h
Hunk #1 succeeded at 3461 (offset -1 lines).
Hunk #2 succeeded at 3486 (offset -1 lines).
checking file src/include/parser/parse_node.h
checking file src/include/replication/logicalrelation.h
checking file src/test/regress/expected/publication.out
Hunk #1 succeeded at 116 (offset 9 lines).
checking file src/test/regress/sql/publication.sql
Hunk #1 succeeded at 69 with fuzz 1 (offset 9 lines).
checking file src/test/subscription/t/013_row_filter.pl
Perhaps that can be fixed?
thanks,
Erik Rijkers
On Sun, 1 Sep 2019 at 06:09, Erik Rijkers <er@xs4all.nl> wrote:
The first 4 of these apply without error, but I can't get 0005 to apply.
This is what I use:
Erik, I generated a new patch set with the patience diff algorithm. It
seems to apply cleanly.
Attachments:
0008-Debug-for-row-filtering.patch (text/x-patch)
From 3ce3e2e31511f7a0a2a93b0709b050794bfaf0b9 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Wed, 14 Mar 2018 00:53:17 +0000
Subject: [PATCH 8/8] Debug for row filtering
---
src/backend/commands/publicationcmds.c | 11 +++++
src/backend/replication/logical/tablesync.c | 1 +
src/backend/replication/pgoutput/pgoutput.c | 66 +++++++++++++++++++++++++++++
3 files changed, 78 insertions(+)
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 716ed2ec58..d0406d14d7 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -341,6 +341,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *rels = NIL;
Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
Oid pubid = pubform->oid;
+ ListCell *lc;
/* Check that user is allowed to manipulate the publication tables. */
if (pubform->puballtables)
@@ -352,6 +353,16 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ foreach(lc, stmt->tables)
+ {
+ PublicationTable *t = lfirst(lc);
+
+ if (t->whereClause == NULL)
+ elog(DEBUG3, "publication \"%s\" has no WHERE clause", NameStr(pubform->pubname));
+ else
+ elog(DEBUG3, "publication \"%s\" has WHERE clause", NameStr(pubform->pubname));
+ }
+
/*
* ALTER PUBLICATION ... DROP TABLE cannot contain a WHERE clause. Use
* publication_table_list node (that accepts a WHERE clause) but forbid the
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 5468b694f6..1fc7d5647b 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -865,6 +865,7 @@ copy_table(Relation rel)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
}
+ elog(DEBUG2, "COPY for initial synchronization: %s", cmd.data);
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 83fad465f8..509124b9cf 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -34,6 +34,7 @@
#include "utils/builtins.h"
#include "utils/inval.h"
#include "utils/int8.h"
+#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -323,6 +324,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
MemoryContext old;
RelationSyncEntry *relentry;
+ Form_pg_class class_form;
+ char *schemaname;
+ char *tablename;
+
if (!is_publishable_relation(relation))
return;
@@ -347,6 +352,17 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Assert(false);
}
+ class_form = RelationGetForm(relation);
+ schemaname = get_namespace_name(class_form->relnamespace);
+ tablename = NameStr(class_form->relname);
+
+ if (change->action == REORDER_BUFFER_CHANGE_INSERT)
+ elog(DEBUG1, "INSERT \"%s\".\"%s\" txid: %u", schemaname, tablename, txn->xid);
+ else if (change->action == REORDER_BUFFER_CHANGE_UPDATE)
+ elog(DEBUG1, "UPDATE \"%s\".\"%s\" txid: %u", schemaname, tablename, txn->xid);
+ else if (change->action == REORDER_BUFFER_CHANGE_DELETE)
+ elog(DEBUG1, "DELETE \"%s\".\"%s\" txid: %u", schemaname, tablename, txn->xid);
+
/* ... then check row filter */
if (list_length(relentry->qual) > 0)
{
@@ -364,6 +380,42 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
tupdesc = RelationGetDescr(relation);
estate = create_estate_for_relation(relation);
+#ifdef _NOT_USED
+ if (old_tuple)
+ {
+ int i;
+
+ for (i = 0; i < tupdesc->natts; i++)
+ {
+ Form_pg_attribute attr;
+ HeapTuple type_tuple;
+ Oid typoutput;
+ bool typisvarlena;
+ bool isnull;
+ Datum val;
+ char *outputstr = NULL;
+
+ attr = TupleDescAttr(tupdesc, i);
+
+ /* Figure out type name */
+ type_tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(attr->atttypid));
+ if (HeapTupleIsValid(type_tuple))
+ {
+ /* Get information needed for printing values of a type */
+ getTypeOutputInfo(attr->atttypid, &typoutput, &typisvarlena);
+
+ val = heap_getattr(old_tuple, i + 1, tupdesc, &isnull);
+ if (!isnull)
+ {
+ outputstr = OidOutputFunctionCall(typoutput, val);
+ elog(DEBUG2, "row filter: REPLICA IDENTITY %s: %s", NameStr(attr->attname), outputstr);
+ pfree(outputstr);
+ }
+ }
+ }
+ }
+#endif
+
/* prepare context per tuple */
ecxt = GetPerTupleExprContext(estate);
oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
@@ -379,6 +431,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Oid expr_type;
Datum res;
bool isnull;
+ char *s = NULL;
qual = (Node *) lfirst(lc);
@@ -389,12 +442,22 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
expr_state = ExecInitExpr(expr, NULL);
res = ExecEvalExpr(expr_state, ecxt, &isnull);
+ elog(DEBUG3, "row filter: result: %s ; isnull: %s", (DatumGetBool(res)) ? "true" : "false", (isnull) ? "true" : "false");
+
/* if tuple does not match row filter, bail out */
if (!DatumGetBool(res) || isnull)
{
+ s = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, CStringGetTextDatum(nodeToString(qual)), ObjectIdGetDatum(relentry->relid)));
+ elog(DEBUG2, "row filter \"%s\" was not matched", s);
+ pfree(s);
+
matched = false;
break;
}
+
+ s = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, CStringGetTextDatum(nodeToString(qual)), ObjectIdGetDatum(relentry->relid)));
+ elog(DEBUG2, "row filter \"%s\" was matched", s);
+ pfree(s);
}
MemoryContextSwitchTo(oldcxt);
@@ -668,9 +731,12 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
MemoryContext oldctx = MemoryContextSwitchTo(CacheMemoryContext);
char *s = TextDatumGetCString(rf_datum);
+ char *t = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, rf_datum, ObjectIdGetDatum(entry->relid)));
Node *rf_node = stringToNode(s);
entry->qual = lappend(entry->qual, rf_node);
MemoryContextSwitchTo(oldctx);
+
+ elog(DEBUG2, "row filter \"%s\" found for publication \"%s\" and relation \"%s\"", t, pub->name, get_rel_name(relid));
}
ReleaseSysCache(rf_tuple);
--
2.11.0
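The qual loop instrumented by the debug patch above can be sketched outside the server. This is a hypothetical Python analogy, not PostgreSQL code: a change is published only if every row filter evaluates to true, and a NULL result counts as a non-match, mirroring the `!DatumGetBool(res) || isnull` test in pgoutput_change.

```python
def row_matches(row, quals):
    """Analogy to the qual loop in pgoutput_change: each qual plays the
    role of an initialized expression evaluated with ExecEvalExpr()."""
    for qual in quals:
        result = qual(row)
        # a NULL (None) or false result means the filter was not matched;
        # bail out, like "matched = false; break" in the C code
        if result is None or not result:
            return False
    return True

# hypothetical rows and filters modeled on the thread's example:
# WHERE (id > 2000 AND id <= 3000), with a NULL id to show NULL handling
rows = [{"id": 1500}, {"id": 2500}, {"id": None}]
quals = [
    lambda r: None if r["id"] is None else r["id"] > 2000,
    lambda r: None if r["id"] is None else r["id"] <= 3000,
]
published = [r for r in rows if row_matches(r, quals)]
# only the row with id = 2500 survives the filter
```

Note how the NULL row is silently dropped rather than raising an error, which matches treating `isnull` as a non-match.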
0004-Rename-a-WHERE-node.patch (text/x-patch; charset=US-ASCII)
From c07af2f00b7a72ba9660e389bb1392fc9e5d2688 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Wed, 24 Jan 2018 17:01:31 -0200
Subject: [PATCH 4/8] Rename a WHERE node
A WHERE clause will be used for row filtering in logical replication. We
already have a similar node: 'WHERE (condition here)'. Let's rename the
node to a generic name and use it for row filtering too.
---
src/backend/parser/gram.y | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c97bb367f8..1de8f56794 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -476,7 +476,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <node> def_arg columnElem where_clause where_or_current_clause
a_expr b_expr c_expr AexprConst indirection_el opt_slice_bound
columnref in_expr having_clause func_table xmltable array_expr
- ExclusionWhereClause operator_def_arg
+ OptWhereClause operator_def_arg
%type <list> rowsfrom_item rowsfrom_list opt_col_def_list
%type <boolean> opt_ordinality
%type <list> ExclusionConstraintList ExclusionConstraintElem
@@ -3710,7 +3710,7 @@ ConstraintElem:
$$ = (Node *)n;
}
| EXCLUDE access_method_clause '(' ExclusionConstraintList ')'
- opt_c_include opt_definition OptConsTableSpace ExclusionWhereClause
+ opt_c_include opt_definition OptConsTableSpace OptWhereClause
ConstraintAttributeSpec
{
Constraint *n = makeNode(Constraint);
@@ -3812,7 +3812,7 @@ ExclusionConstraintElem: index_elem WITH any_operator
}
;
-ExclusionWhereClause:
+OptWhereClause:
WHERE '(' a_expr ')' { $$ = $3; }
| /*EMPTY*/ { $$ = NULL; }
;
--
2.11.0
0003-Refactor-function-create_estate_for_relation.patch (text/x-patch; charset=US-ASCII)
From 367631ac4ba1e41170d59d39693e2eaf7c406621 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Tue, 27 Feb 2018 02:21:03 +0000
Subject: [PATCH 3/8] Refactor function create_estate_for_relation
Relation localrel is the only LogicalRepRelMapEntry structure member
that is useful for create_estate_for_relation.
---
src/backend/replication/logical/worker.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 11e6331f49..d9952c8b7e 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -173,7 +173,7 @@ ensure_transaction(void)
* This is based on similar code in copy.c
*/
static EState *
-create_estate_for_relation(LogicalRepRelMapEntry *rel)
+create_estate_for_relation(Relation rel)
{
EState *estate;
ResultRelInfo *resultRelInfo;
@@ -183,13 +183,13 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel)
rte = makeNode(RangeTblEntry);
rte->rtekind = RTE_RELATION;
- rte->relid = RelationGetRelid(rel->localrel);
- rte->relkind = rel->localrel->rd_rel->relkind;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
rte->rellockmode = AccessShareLock;
ExecInitRangeTable(estate, list_make1(rte));
resultRelInfo = makeNode(ResultRelInfo);
- InitResultRelInfo(resultRelInfo, rel->localrel, 1, NULL, 0);
+ InitResultRelInfo(resultRelInfo, rel, 1, NULL, 0);
estate->es_result_relations = resultRelInfo;
estate->es_num_result_relations = 1;
@@ -589,7 +589,7 @@ apply_handle_insert(StringInfo s)
}
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -696,7 +696,7 @@ apply_handle_update(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -815,7 +815,7 @@ apply_handle_delete(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
--
2.11.0
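The refactor above narrows the parameter from the wrapper struct to the one member the function actually uses, so callers that have a bare Relation (such as the row-filtering code in pgoutput) can reuse it. A hypothetical Python analogy of the same design choice:

```python
class Relation:
    """Stand-in for PostgreSQL's Relation; fields are illustrative."""
    def __init__(self, relid, relkind):
        self.relid = relid
        self.relkind = relkind

class LogicalRepRelMapEntry:
    """Wrapper struct; localrel is the only member the callee needs."""
    def __init__(self, localrel):
        self.localrel = localrel

def create_estate_for_relation(rel):
    # before the refactor this took a LogicalRepRelMapEntry and reached
    # into rel.localrel; now it takes the Relation directly
    return {"relid": rel.relid, "relkind": rel.relkind}

# apply-worker callers now pass entry.localrel explicitly
entry = LogicalRepRelMapEntry(Relation(16384, "r"))
estate = create_estate_for_relation(entry.localrel)
```

Accepting the narrowest type the function needs keeps the callee decoupled from the apply worker's cache entry.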
0001-Remove-unused-atttypmod-column-from-initial-table-sy.patch (text/x-patch; charset=US-ASCII)
From e83de1cab559f4ca8f9a75356e220356814cd243 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Fri, 9 Mar 2018 18:39:22 +0000
Subject: [PATCH 1/8] Remove unused atttypmod column from initial table
synchronization
Since commit 7c4f52409a8c7d85ed169bbbc1f6092274d03920, atttypmod was
fetched but never used. The removal is safe because COPY from the
publisher does not need that information.
---
src/backend/replication/logical/tablesync.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 7881079e96..0a565dd837 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -647,7 +647,7 @@ fetch_remote_table_info(char *nspname, char *relname,
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[2] = {OIDOID, CHAROID};
- Oid attrRow[4] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
+ Oid attrRow[3] = {TEXTOID, OIDOID, BOOLOID};
bool isnull;
int natt;
@@ -691,7 +691,6 @@ fetch_remote_table_info(char *nspname, char *relname,
appendStringInfo(&cmd,
"SELECT a.attname,"
" a.atttypid,"
- " a.atttypmod,"
" a.attnum = ANY(i.indkey)"
" FROM pg_catalog.pg_attribute a"
" LEFT JOIN pg_catalog.pg_index i"
@@ -703,7 +702,7 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->remoteid,
(walrcv_server_version(wrconn) >= 120000 ? "AND a.attgenerated = ''" : ""),
lrel->remoteid);
- res = walrcv_exec(wrconn, cmd.data, 4, attrRow);
+ res = walrcv_exec(wrconn, cmd.data, 3, attrRow);
if (res->status != WALRCV_OK_TUPLES)
ereport(ERROR,
@@ -724,7 +723,7 @@ fetch_remote_table_info(char *nspname, char *relname,
Assert(!isnull);
lrel->atttyps[natt] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
Assert(!isnull);
- if (DatumGetBool(slot_getattr(slot, 4, &isnull)))
+ if (DatumGetBool(slot_getattr(slot, 3, &isnull)))
lrel->attkeys = bms_add_member(lrel->attkeys, natt);
/* Should never happen. */
--
2.11.0
0002-Store-number-of-tuples-in-WalRcvExecResult.patch (text/x-patch; charset=US-ASCII)
From f6795f1b6efab48caa07487d9186f844dd48fc65 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Fri, 9 Mar 2018 17:37:36 +0000
Subject: [PATCH 2/8] Store number of tuples in WalRcvExecResult
This is useful information when allocating memory for queries that
return more than one row. It reduces memory allocation during initial
table synchronization.
---
src/backend/replication/libpqwalreceiver/libpqwalreceiver.c | 5 +++--
src/backend/replication/logical/tablesync.c | 5 ++---
src/include/replication/walreceiver.h | 1 +
3 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 6eba08a920..343550a335 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -878,6 +878,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
errdetail("Expected %d fields, got %d fields.",
nRetTypes, nfields)));
+ walres->ntuples = PQntuples(pgres);
walres->tuplestore = tuplestore_begin_heap(true, false, work_mem);
/* Create tuple descriptor corresponding to expected result. */
@@ -888,7 +889,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
/* No point in doing more here if there were no tuples returned. */
- if (PQntuples(pgres) == 0)
+ if (walres->ntuples == 0)
return;
/* Create temporary context for local allocations. */
@@ -897,7 +898,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
ALLOCSET_DEFAULT_SIZES);
/* Process returned rows. */
- for (tupn = 0; tupn < PQntuples(pgres); tupn++)
+ for (tupn = 0; tupn < walres->ntuples; tupn++)
{
char *cstrs[MaxTupleAttributeNumber];
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 0a565dd837..42db4ada9e 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -709,9 +709,8 @@ fetch_remote_table_info(char *nspname, char *relname,
(errmsg("could not fetch table info for table \"%s.%s\": %s",
nspname, relname, res->err)));
- /* We don't know the number of rows coming, so allocate enough space. */
- lrel->attnames = palloc0(MaxTupleAttributeNumber * sizeof(char *));
- lrel->atttyps = palloc0(MaxTupleAttributeNumber * sizeof(Oid));
+ lrel->attnames = palloc0(res->ntuples * sizeof(char *));
+ lrel->atttyps = palloc0(res->ntuples * sizeof(Oid));
lrel->attkeys = NULL;
natt = 0;
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index e12a934966..0d32d598d8 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -196,6 +196,7 @@ typedef struct WalRcvExecResult
char *err;
Tuplestorestate *tuplestore;
TupleDesc tupledesc;
+ int ntuples;
} WalRcvExecResult;
/* libpqwalreceiver hooks */
--
2.11.0
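The idea behind this patch can be sketched as follows; names are illustrative, but the bound matches PostgreSQL's MaxTupleAttributeNumber. Instead of sizing the attribute arrays for the worst case, fetch_remote_table_info can size them by the tuple count the walreceiver result already carries:

```python
MAX_TUPLE_ATTRIBUTE_NUMBER = 1664  # PostgreSQL's compile-time upper bound

def alloc_attr_arrays(ntuples):
    """Analogy to palloc0(res->ntuples * sizeof(...)): one slot per
    attribute row actually returned, instead of the worst case."""
    attnames = [None] * ntuples
    atttyps = [0] * ntuples
    return attnames, atttyps

# a three-column table now allocates 3 slots rather than 1664
attnames, atttyps = alloc_attr_arrays(3)
slots_saved = MAX_TUPLE_ATTRIBUTE_NUMBER - len(attnames)
```

For a typical table with a handful of columns, almost the entire worst-case allocation is avoided.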
0005-Row-filtering-for-logical-replication.patch (text/x-patch; charset=US-ASCII)
From 44f401ecb53dd43e3a7a05912e25d644748f936f Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Tue, 27 Feb 2018 04:03:13 +0000
Subject: [PATCH 5/8] Row filtering for logical replication
When you define or modify a publication, you can optionally filter the
rows to be published using a WHERE condition. This condition can be any
expression that evaluates to boolean. Only rows that satisfy the WHERE
condition will be sent to subscribers.
---
doc/src/sgml/catalogs.sgml | 9 +++
doc/src/sgml/ref/alter_publication.sgml | 11 ++-
doc/src/sgml/ref/create_publication.sgml | 26 +++++-
src/backend/catalog/pg_publication.c | 102 ++++++++++++++++++++++--
src/backend/commands/publicationcmds.c | 93 +++++++++++++++-------
src/backend/parser/gram.y | 26 ++++--
src/backend/parser/parse_agg.c | 10 +++
src/backend/parser/parse_expr.c | 14 +++-
src/backend/parser/parse_func.c | 3 +
src/backend/replication/logical/tablesync.c | 119 +++++++++++++++++++++++++---
src/backend/replication/logical/worker.c | 2 +-
src/backend/replication/pgoutput/pgoutput.c | 100 ++++++++++++++++++++++-
src/include/catalog/pg_publication.h | 9 ++-
src/include/catalog/pg_publication_rel.h | 10 ++-
src/include/catalog/toasting.h | 1 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 ++-
src/include/parser/parse_node.h | 1 +
src/include/replication/logicalrelation.h | 2 +
src/test/regress/expected/publication.out | 29 +++++++
src/test/regress/sql/publication.sql | 21 +++++
src/test/subscription/t/013_row_filter.pl | 96 ++++++++++++++++++++++
22 files changed, 629 insertions(+), 67 deletions(-)
create mode 100644 src/test/subscription/t/013_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 5e71a2e865..7f11225f65 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -5595,6 +5595,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<entry><literal><link linkend="catalog-pg-class"><structname>pg_class</structname></link>.oid</literal></entry>
<entry>Reference to relation</entry>
</row>
+
+ <row>
+ <entry><structfield>prqual</structfield></entry>
+ <entry><type>pg_node_tree</type></entry>
+ <entry></entry>
+ <entry>Expression tree (in the form of a
+ <function>nodeToString()</function> representation) for the relation's
+ qualifying condition</entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index 534e598d93..9608448207 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_USER | SESSION_USER }
@@ -91,7 +91,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is executed with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 99f87ca393..6e99943374 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
@@ -68,7 +68,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
that table is added to the publication. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are added.
Optionally, <literal>*</literal> can be specified after the table name to
- explicitly indicate that descendant tables are included.
+ explicitly indicate that descendant tables are included. If the optional
+ <literal>WHERE</literal> clause is specified, rows that do not satisfy
+ the <replaceable class="parameter">expression</replaceable> will not be
+ published. Note that parentheses are required around the expression.
</para>
<para>
@@ -157,6 +160,13 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ Columns used in the <literal>WHERE</literal> clause must be part of the
+ primary key or be covered by <literal>REPLICA IDENTITY</literal>, otherwise
+ <command>UPDATE</command> and <command>DELETE</command> operations will not
+ be replicated.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -171,6 +181,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -184,6 +199,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index fd5da7d5f7..c873419a9e 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -34,6 +34,10 @@
#include "catalog/pg_publication.h"
#include "catalog/pg_publication_rel.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
+
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -149,18 +153,21 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
+ Oid relid = RelationGetRelid(targetrel->relation);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ RangeTblEntry *rte;
+ Node *whereclause;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -180,10 +187,27 @@ publication_add_relation(Oid pubid, Relation targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel), pub->name)));
+ RelationGetRelationName(targetrel->relation), pub->name)));
}
- check_publication_add_relation(targetrel);
+ check_publication_add_relation(targetrel->relation);
+
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(targetrel->whereClause);
+
+ rte = addRangeTableEntryForRelation(pstate, targetrel->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addRTEtoQuery(pstate, rte, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(targetrel->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -197,6 +221,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -213,11 +243,17 @@ publication_add_relation(Oid pubid, Relation targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel);
+ CacheInvalidateRelcache(targetrel->relation);
return myself;
}
@@ -292,6 +328,62 @@ GetPublicationRelations(Oid pubid)
}
/*
+ * Gets list of PublicationRelationQuals for a publication.
+ */
+List *
+GetPublicationRelationQuals(Oid pubid)
+{
+ List *result;
+ Relation pubrelsrel;
+ ScanKeyData scankey;
+ SysScanDesc scan;
+ HeapTuple tup;
+
+ /* Find all publications associated with the relation. */
+ pubrelsrel = heap_open(PublicationRelRelationId, AccessShareLock);
+
+ ScanKeyInit(&scankey,
+ Anum_pg_publication_rel_prpubid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(pubid));
+
+ scan = systable_beginscan(pubrelsrel, PublicationRelPrrelidPrpubidIndexId,
+ true, NULL, 1, &scankey);
+
+ result = NIL;
+ while (HeapTupleIsValid(tup = systable_getnext(scan)))
+ {
+ Form_pg_publication_rel pubrel;
+ PublicationRelationQual *relqual;
+ Datum value_datum;
+ char *qual_value;
+ Node *qual_expr;
+ bool isnull;
+
+ pubrel = (Form_pg_publication_rel) GETSTRUCT(tup);
+
+ value_datum = heap_getattr(tup, Anum_pg_publication_rel_prqual, RelationGetDescr(pubrelsrel), &isnull);
+ if (!isnull)
+ {
+ qual_value = TextDatumGetCString(value_datum);
+ qual_expr = (Node *) stringToNode(qual_value);
+ }
+ else
+ qual_expr = NULL;
+
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = table_open(pubrel->prrelid, ShareUpdateExclusiveLock);
+ relqual->whereClause = copyObject(qual_expr);
+ result = lappend(result, relqual);
+ }
+
+ systable_endscan(scan);
+ heap_close(pubrelsrel, AccessShareLock);
+
+ return result;
+}
+
+/*
* Gets list of publication oids for publications marked as FOR ALL TABLES.
*/
List *
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index f115d4bf80..716ed2ec58 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -352,6 +352,27 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ /*
+ * ALTER PUBLICATION ... DROP TABLE cannot contain a WHERE clause. Use
+ * publication_table_list node (that accepts a WHERE clause) but forbid the
+ * WHERE clause in it. The use of relation_expr_list node just for the
+ * DROP TABLE part is not worth the trouble.
+ */
+ if (stmt->tableAction == DEFELEM_DROP)
+ {
+ ListCell *lc;
+
+ foreach(lc, stmt->tables)
+ {
+ PublicationTable *t = lfirst(lc);
+ if (t->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot use a WHERE clause for removing table from publication \"%s\"",
+ NameStr(pubform->pubname))));
+ }
+ }
+
rels = OpenTableList(stmt->tables);
if (stmt->tableAction == DEFELEM_ADD)
@@ -360,47 +381,55 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
PublicationDropTables(pubid, rels, false);
else /* DEFELEM_SET */
{
- List *oldrelids = GetPublicationRelations(pubid);
+ List *oldrels = GetPublicationRelationQuals(pubid);
List *delrels = NIL;
ListCell *oldlc;
/* Calculate which relations to drop. */
- foreach(oldlc, oldrelids)
+ foreach(oldlc, oldrels)
{
- Oid oldrelid = lfirst_oid(oldlc);
+ PublicationRelationQual *oldrel = lfirst(oldlc);
+ PublicationRelationQual *newrel;
ListCell *newlc;
bool found = false;
foreach(newlc, rels)
{
- Relation newrel = (Relation) lfirst(newlc);
+ newrel = (PublicationRelationQual *) lfirst(newlc);
- if (RelationGetRelid(newrel) == oldrelid)
+ if (RelationGetRelid(newrel->relation) == RelationGetRelid(oldrel->relation))
{
found = true;
break;
}
}
- if (!found)
+ /*
+ * Remove the publication / relation mapping if (i) the table is not
+ * found in the new list or (ii) the table is found in the new list but
+ * its qual does not match the old one (in this case, a simple tuple
+ * update is not enough because of the dependencies).
+ */
+ if (!found || (found && !equal(oldrel->whereClause, newrel->whereClause)))
{
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
+ PublicationRelationQual *oldrelqual = palloc(sizeof(PublicationRelationQual));
+ oldrelqual->relation = table_open(RelationGetRelid(oldrel->relation),
+ ShareUpdateExclusiveLock);
- delrels = lappend(delrels, oldrel);
+ delrels = lappend(delrels, oldrelqual);
}
}
/* And drop them. */
PublicationDropTables(pubid, delrels, true);
+ CloseTableList(oldrels);
+ CloseTableList(delrels);
/*
* Don't bother calculating the difference for adding, we'll catch and
* skip existing ones when doing catalog update.
*/
PublicationAddTables(pubid, rels, true, stmt);
-
- CloseTableList(delrels);
}
CloseTableList(rels);
@@ -510,16 +539,18 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationQual *relqual;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
- bool recurse = rv->inh;
- Relation rel;
- Oid myrelid;
+ PublicationTable *t = lfirst(lc);
+ RangeVar *rv = castNode(RangeVar, t->relation);
+ bool recurse = rv->inh;
+ Relation rel;
+ Oid myrelid;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
@@ -539,8 +570,10 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = rel;
+ relqual->whereClause = t->whereClause;
+ rels = lappend(rels, relqual);
relids = lappend_oid(relids, myrelid);
/* Add children of this rel, if requested */
@@ -568,7 +601,11 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = rel;
+ /* child inherits WHERE clause from parent */
+ relqual->whereClause = t->whereClause;
+ rels = lappend(rels, relqual);
relids = lappend_oid(relids, childrelid);
}
}
@@ -589,10 +626,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -608,13 +647,13 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(RelationGetRelid(rel->relation), GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->relation->rd_rel->relkind),
+ RelationGetRelationName(rel->relation));
obj = publication_add_relation(pubid, rel, if_not_exists);
if (stmt)
@@ -640,8 +679,8 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
+ Oid relid = RelationGetRelid(rel->relation);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
ObjectIdGetDatum(relid),
@@ -654,7 +693,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(rel->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 1de8f56794..bd87e80e1b 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -404,13 +404,13 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
relation_expr_list dostmt_opt_list
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
- publication_name_list
+ publication_name_list publication_table_list
vacuum_relation_list opt_vacuum_relation_list
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <value> publication_name_item
%type <list> opt_fdw_options fdw_options
@@ -9518,7 +9518,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9549,7 +9549,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9557,7 +9557,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9565,7 +9565,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE relation_expr_list
+ | ALTER PUBLICATION name DROP TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9575,6 +9575,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index f418c61545..dea5aadca7 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -544,6 +544,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -933,6 +940,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_GENERATED_COLUMN:
err = _("window functions are not allowed in column generation expressions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 76f3dd7076..6d2c6a28ea 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -170,6 +170,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in WHERE"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -571,6 +578,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_CALL_ARGUMENT:
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1924,13 +1932,15 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
break;
case EXPR_KIND_CALL_ARGUMENT:
err = _("cannot use subquery in CALL argument");
break;
case EXPR_KIND_COPY_WHERE:
err = _("cannot use subquery in COPY FROM WHERE condition");
break;
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3561,6 +3571,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "WHERE";
case EXPR_KIND_GENERATED_COLUMN:
return "GENERATED AS";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 8e926539e6..66458d8a48 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2516,6 +2516,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_GENERATED_COLUMN:
err = _("set-returning functions are not allowed in column generation expressions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 42db4ada9e..5468b694f6 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -637,19 +637,26 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[2] = {OIDOID, CHAROID};
Oid attrRow[3] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[1] = {TEXTOID};
bool isnull;
- int natt;
+ int n;
+ ListCell *lc;
+ bool first;
+
+ /* Avoid trashing relation map cache */
+ memset(lrel, 0, sizeof(LogicalRepRelation));
lrel->nspname = nspname;
lrel->relname = relname;
@@ -713,20 +720,20 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->atttyps = palloc0(res->ntuples * sizeof(Oid));
lrel->attkeys = NULL;
- natt = 0;
+ n = 0;
slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
{
- lrel->attnames[natt] =
+ lrel->attnames[n] =
TextDatumGetCString(slot_getattr(slot, 1, &isnull));
Assert(!isnull);
- lrel->atttyps[natt] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
+ lrel->atttyps[n] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
Assert(!isnull);
if (DatumGetBool(slot_getattr(slot, 3, &isnull)))
- lrel->attkeys = bms_add_member(lrel->attkeys, natt);
+ lrel->attkeys = bms_add_member(lrel->attkeys, n);
/* Should never happen. */
- if (++natt >= MaxTupleAttributeNumber)
+ if (++n >= MaxTupleAttributeNumber)
elog(ERROR, "too many columns in remote table \"%s.%s\"",
nspname, relname);
@@ -734,7 +741,46 @@ fetch_remote_table_info(char *nspname, char *relname,
}
ExecDropSingleTupleTableSlot(slot);
- lrel->natts = natt;
+ lrel->natts = n;
+
+ walrcv_clear_result(res);
+
+ /* Get relation qual */
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd, "SELECT pg_get_expr(prqual, prrelid) FROM pg_publication p INNER JOIN pg_publication_rel pr ON (p.oid = pr.prpubid) WHERE pr.prrelid = %u AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(wrconn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
walrcv_clear_result(res);
pfree(cmd.data);
@@ -750,6 +796,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyState cstate;
@@ -758,7 +805,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -767,10 +814,57 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* list of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
initStringInfo(&cmd);
- appendStringInfo(&cmd, "COPY %s TO STDOUT",
- quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /*
+ * If publication has any row filter, build a SELECT query with OR'ed row
+ * filters for COPY. If no row filters are available, use COPY for all
+ * table contents.
+ */
+ if (list_length(qual) > 0)
+ {
+ ListCell *lc;
+ bool first;
+
+ appendStringInfoString(&cmd, "COPY (SELECT ");
+ /* list of attribute names */
+ first = true;
+ foreach(lc, attnamelist)
+ {
+ char *col = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+ appendStringInfo(&cmd, "%s", quote_identifier(col));
+ }
+ appendStringInfo(&cmd, " FROM %s",
+ quote_qualified_identifier(lrel.nspname, lrel.relname));
+ appendStringInfoString(&cmd, " WHERE ");
+ /* list of OR'ed filters */
+ first = true;
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
+ list_free_deep(qual);
+ }
+ else
+ {
+ appendStringInfo(&cmd, "COPY %s TO STDOUT",
+ quote_qualified_identifier(lrel.nspname, lrel.relname));
+ }
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
@@ -785,7 +879,6 @@ copy_table(Relation rel)
addRangeTableEntryForRelation(pstate, rel, AccessShareLock,
NULL, false, false);
- attnamelist = make_copy_attnamelist(relmapentry);
cstate = BeginCopyFrom(pstate, rel, NULL, false, copy_read_data, attnamelist, NIL);
/* Do the copy */
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index d9952c8b7e..cef0c52955 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -172,7 +172,7 @@ ensure_transaction(void)
*
* This is based on similar code in copy.c
*/
-static EState *
+EState *
create_estate_for_relation(Relation rel)
{
EState *estate;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 9c08757fca..83fad465f8 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,15 +12,26 @@
*/
#include "postgres.h"
+#include "catalog/pg_type.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
+
+#include "executor/executor.h"
+#include "nodes/execnodes.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/planner.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "fmgr.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/inval.h"
#include "utils/int8.h"
#include "utils/memutils.h"
@@ -60,6 +71,7 @@ typedef struct RelationSyncEntry
bool schema_sent; /* did we send the schema? */
bool replicate_valid;
PublicationActions pubactions;
+ List *qual;
} RelationSyncEntry;
/* Map used to remember which relation schemas we sent. */
@@ -335,6 +347,65 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Assert(false);
}
+ /* ... then check row filter */
+ if (list_length(relentry->qual) > 0)
+ {
+ HeapTuple old_tuple;
+ HeapTuple new_tuple;
+ TupleDesc tupdesc;
+ EState *estate;
+ ExprContext *ecxt;
+ MemoryContext oldcxt;
+ ListCell *lc;
+ bool matched = true;
+
+ old_tuple = change->data.tp.oldtuple ? &change->data.tp.oldtuple->tuple : NULL;
+ new_tuple = change->data.tp.newtuple ? &change->data.tp.newtuple->tuple : NULL;
+ tupdesc = RelationGetDescr(relation);
+ estate = create_estate_for_relation(relation);
+
+ /* prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsVirtual);
+
+ ExecStoreHeapTuple(new_tuple ? new_tuple : old_tuple, ecxt->ecxt_scantuple, false);
+
+ foreach (lc, relentry->qual)
+ {
+ Node *qual;
+ ExprState *expr_state;
+ Expr *expr;
+ Oid expr_type;
+ Datum res;
+ bool isnull;
+
+ qual = (Node *) lfirst(lc);
+
+ /* evaluates row filter */
+ expr_type = exprType(qual);
+ expr = (Expr *) coerce_to_target_type(NULL, qual, expr_type, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+ expr = expression_planner(expr);
+ expr_state = ExecInitExpr(expr, NULL);
+ res = ExecEvalExpr(expr_state, ecxt, &isnull);
+
+ /* if tuple does not match row filter, bail out */
+ if (!DatumGetBool(res) || isnull)
+ {
+ matched = false;
+ break;
+ }
+ }
+
+ MemoryContextSwitchTo(oldcxt);
+
+ ExecDropSingleTupleTableSlot(ecxt->ecxt_scantuple);
+ FreeExecutorState(estate);
+
+ if (!matched)
+ return;
+ }
+
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
@@ -570,10 +641,14 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
*/
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->qual = NIL;
foreach(lc, data->publications)
{
Publication *pub = lfirst(lc);
+ HeapTuple rf_tuple;
+ Datum rf_datum;
+ bool rf_isnull;
if (pub->alltables || list_member_oid(pubids, pub->oid))
{
@@ -583,9 +658,23 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /* Cache row filters, if available */
+ rf_tuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rf_tuple))
+ {
+ rf_datum = SysCacheGetAttr(PUBLICATIONRELMAP, rf_tuple, Anum_pg_publication_rel_prqual, &rf_isnull);
+
+ if (!rf_isnull)
+ {
+ MemoryContext oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ char *s = TextDatumGetCString(rf_datum);
+ Node *rf_node = stringToNode(s);
+ entry->qual = lappend(entry->qual, rf_node);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rf_tuple);
+ }
}
list_free(pubids);
@@ -660,5 +749,10 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
*/
hash_seq_init(&status, RelationSyncCache);
while ((entry = (RelationSyncEntry *) hash_seq_search(&status)) != NULL)
+ {
entry->replicate_valid = false;
+ if (list_length(entry->qual) > 0)
+ list_free_deep(entry->qual);
+ entry->qual = NIL;
+ }
}
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 20a2f0ac1b..ae8d2845a0 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -78,15 +78,22 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationQual
+{
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationQual;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
extern List *GetPublicationRelations(Oid pubid);
+extern List *GetPublicationRelationQuals(Oid pubid);
extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(void);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index 5f5bc92ab3..a75b2d5345 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -28,9 +28,13 @@
*/
CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
{
- Oid oid; /* oid */
- Oid prpubid; /* Oid of the publication */
- Oid prrelid; /* Oid of the relation */
+ Oid oid; /* oid */
+ Oid prpubid; /* Oid of the publication */
+ Oid prrelid; /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index cc5dfed0bf..d57ca82e89 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -66,6 +66,7 @@ DECLARE_TOAST(pg_namespace, 4163, 4164);
DECLARE_TOAST(pg_partitioned_table, 4165, 4166);
DECLARE_TOAST(pg_policy, 4167, 4168);
DECLARE_TOAST(pg_proc, 2836, 2837);
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
DECLARE_TOAST(pg_rewrite, 2838, 2839);
DECLARE_TOAST(pg_seclabel, 3598, 3599);
DECLARE_TOAST(pg_statistic, 2840, 2841);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 3cbb08df92..7f83da1ee8 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -476,6 +476,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 94ded3c135..359f773092 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3461,12 +3461,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3479,7 +3486,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 7c099e7084..c2e8b9fcb9 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -73,6 +73,7 @@ typedef enum ParseExprKind
EXPR_KIND_CALL_ARGUMENT, /* procedure argument in CALL */
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 2642a3f94e..5cc307ee0e 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -39,4 +39,6 @@ extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
extern void logicalrep_typmap_update(LogicalRepTyp *remotetyp);
extern char *logicalrep_typmap_gettypname(Oid remoteid);
+extern EState *create_estate_for_relation(Relation rel);
+
#endif /* LOGICALRELATION_H */
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index feb51e4add..202173c376 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -116,6 +116,35 @@ Tables:
DROP TABLE testpub_tbl3, testpub_tbl3a;
DROP PUBLICATION testpub3, testpub4;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in WHERE
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+\dRp+ testpub5
+                              Publication testpub5
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+ "public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: "testpub_view" is not a table
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 5773a755cf..6f0d088984 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -69,6 +69,27 @@ RESET client_min_messages;
DROP TABLE testpub_tbl3, testpub_tbl3a;
DROP PUBLICATION testpub3, testpub4;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+\dRp+ testpub5
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/013_row_filter.pl b/src/test/subscription/t/013_row_filter.pl
new file mode 100644
index 0000000000..99e6db94d6
--- /dev/null
+++ b/src/test/subscription/t/013_row_filter.pl
@@ -0,0 +1,96 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 4;
+
+# create publisher node
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')");
+
+my $result = $node_publisher->psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 DROP TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')");
+is($result, 3, "syntax error for ALTER PUBLICATION DROP TABLE");
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)");
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3");
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)");
+
+# test row filtering
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 10)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table sync to finish
+my $synced_query =
+"SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT a, b FROM tab_rowfilter_1");
+is($result, qq(1980|not filtered
+1001|test 1001
+1002|test 1002), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(7|2|10), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check filtered data was copied to subscriber');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
2.11.0
Attachment: 0006-Print-publication-WHERE-condition-in-psql.patch (text/x-patch)
From 8558d7b9913311255b0a4f349ebc0f60f6a83393 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Thu, 17 May 2018 20:52:28 +0000
Subject: [PATCH 6/8] Print publication WHERE condition in psql
---
src/bin/psql/describe.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 774cc764ff..819b74bf4c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5875,7 +5875,8 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
+ "SELECT n.nspname, c.relname,\n"
+ " pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
@@ -5905,6 +5906,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE %s",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
--
2.11.0
Attachment: 0007-Publication-where-condition-support-for-pg_dump.patch (text/x-patch)
From df95a451055ae5325a7494b355f2ef3383dbf304 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Sat, 15 Sep 2018 02:52:00 +0000
Subject: [PATCH 7/8] Publication where condition support for pg_dump
---
src/bin/pg_dump/pg_dump.c | 15 +++++++++++++--
src/bin/pg_dump/pg_dump.h | 1 +
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 34981401bf..0507441209 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3959,6 +3959,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_tableoid;
int i_oid;
int i_pubname;
+ int i_pubrelqual;
int i,
j,
ntups;
@@ -3991,7 +3992,8 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
/* Get the publication membership for the table. */
appendPQExpBuffer(query,
- "SELECT pr.tableoid, pr.oid, p.pubname "
+ "SELECT pr.tableoid, pr.oid, p.pubname, "
+ "pg_catalog.pg_get_expr(pr.prqual, pr.prrelid) AS pubrelqual "
"FROM pg_publication_rel pr, pg_publication p "
"WHERE pr.prrelid = '%u'"
" AND p.oid = pr.prpubid",
@@ -4012,6 +4014,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_pubname = PQfnumber(res, "pubname");
+ i_pubrelqual = PQfnumber(res, "pubrelqual");
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4027,6 +4030,11 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].pubname = pg_strdup(PQgetvalue(res, j, i_pubname));
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, j, i_pubrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, j, i_pubrelqual));
+
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
}
@@ -4055,8 +4063,11 @@ dumpPublicationTable(Archive *fout, PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubrinfo->pubname));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE %s", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index dfba58ac58..3ed2e1be9c 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -608,6 +608,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
TableInfo *pubtable;
char *pubname;
+ char *pubrelqual;
} PublicationRelInfo;
/*
--
2.11.0
On 2019-09-02 01:43, Euler Taveira wrote:
On Sun, Sep 1, 2019 at 06:09, Erik Rijkers <er@xs4all.nl> wrote:
The first 4 of these apply without error, but I can't get 0005 to
apply. This is what I use:
Erik, I generated a new patch set with the patience diff algorithm. It
seems to apply cleanly.
It did apply cleanly, thanks.
But I can't get it to correctly do the partial replication in the
attached pgbench-script (similar versions of which script I also used
for earlier versions of the patch, last year).
There are complaints in the log (both pub and sub) like:
ERROR: trying to store a heap tuple into wrong type of slot
I have no idea what causes that.
I attach a zip:
$ unzip -l logrep_rowfilter.zip
Archive: logrep_rowfilter.zip
Length Date Time Name
--------- ---------- ----- ----
17942 2019-09-03 00:47 logfile.6525
10412 2019-09-03 00:47 logfile.6526
6913 2019-09-03 00:47 logrep_rowfilter_2_nodes.sh
3371 2019-09-03 00:47 output.txt
--------- -------
38638 4 files
That bash script runs 2 instances (compiled for my local setup, so it
will not run as-is) and tries for one minute to get a slice of the
pgbench_accounts table replicated. One minute is short, but I wanted
short logfiles; I have tried the same for up to 20 minutes without the
replication completing. I'll try even longer, but in the meantime I hope
you can figure out why these errors occur.
thanks,
Erik Rijkers
Attachments:
There are complaints in the log (both pub and sub) like:
ERROR: trying to store a heap tuple into wrong type of slot
I have no idea what causes that.
Yeah, I've seen that too. It was fixed by Alexey Kondratov: at line 955 of 0005-Row-filtering-for-logical-replication.patch it should be &TTSOpsHeapTuple instead of &TTSOpsVirtual.
On Tue, Sep 3, 2019 at 00:16, Alexey Zagarin <zagarin@gmail.com> wrote:
There are complaints in the log (both pub and sub) like:
ERROR: trying to store a heap tuple into wrong type of slot
I have no idea what causes that.
Yeah, I've seen that too. It was fixed by Alexey Kondratov: at line 955 of 0005-Row-filtering-for-logical-replication.patch it should be &TTSOpsHeapTuple instead of &TTSOpsVirtual.
Oops... exactly. That was an oversight while experimenting with different types of slots.
--
Euler Taveira Timbira -
http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
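For readers following along, the one-line fix under discussion amounts to roughly the following fragment (a sketch only, not compilable on its own; the variable names are assumed from the pgoutput_change context in the patch):

```c
/*
 * The decoded change carries a HeapTuple, so the scan slot must be
 * created with heap-tuple callbacks. Creating it with &TTSOpsVirtual
 * is what produced "trying to store a heap tuple into wrong type of
 * slot" when the tuple was stored below.
 */
scanslot = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
ExecStoreHeapTuple(new_tuple, scanslot, false);	/* false: slot does not own the tuple */
```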
On 2019-09-03 05:32, Euler Taveira wrote:
On Tue, Sep 3, 2019 at 00:16, Alexey Zagarin <zagarin@gmail.com>
wrote:
There are complaints in the log (both pub and sub) like:
ERROR: trying to store a heap tuple into wrong type of slot
I have no idea what causes that.
Yeah, I've seen that too. It was fixed by Alexey Kondratov: at line
955 of 0005-Row-filtering-for-logical-replication.patch it should be
&TTSOpsHeapTuple instead of &TTSOpsVirtual.
Oops... exactly. That was an oversight while experimenting with
different types of slots.
OK, I'll consider Alexey Kondratov's set of patches as the current
state-of-the-art then. (They still apply.)
I found a problem, though I'm not sure it's a bug:
The attached bash script does a test by setting up pgbench tables on
both master and replica, and then sets up logical replication for a
slice of pgbench_accounts. Then it does a short pgbench run, and loops
until the results become identical (OK), or breaks out after a certain
time (NOK = not OK).
It turns out this did not work until I added a wait state after the
CREATE SUBSCRIPTION. It always fails without the wait state, and always
works with the wait state.
Do you agree this is a bug?
thanks (also to both Alexeys :))
Erik Rijkers
PS
by the way, this script won't run as-is on other machines; it has stuff
particular to my local setup.
Attachments:
OK, I'll consider Alexey Kondratov's set of patches as the current
state-of-the-art then. (They still apply.)
Alexey's patch is the rebased version of Euler's previous patch set, with the slot-type mistake fixed and adapted to the current changes in the master branch. It also has testing improvements. On the other hand, the new patches from Euler include more fixes and the implementation of filtering in COPY (as far as I can tell from the code), which addresses my particular pain point with BDR. I hope they'll be merged soon. :)
It turns out this did not work until I added a wait state after the
CREATE SUBSCRIPTION. It always fails without the wait state, and always
works with the wait state.
Do you agree this is a bug?
I'm not sure this is a bug: after the subscription is added (or a new table is added to the publication and the subscription is then refreshed), the whole table is synchronized using a COPY statement. Depending on the size of the table, this can take some time. For a more reliable implementation, you may want to check srsubstate in pg_subscription_rel instead of just sleeping.
Alexey
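The srsubstate check suggested above could look roughly like this on the subscriber (a sketch; in pg_subscription_rel, 'r' marks a table whose initial synchronization is ready):

```sql
-- Loop on this from the test script until it returns true:
-- all tables of the subscription have finished initial sync.
SELECT count(*) = 0 AS all_ready
FROM pg_subscription_rel
WHERE srsubstate <> 'r';
```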
On Tue, Sep 3, 2019 at 00:32, Euler Taveira
<euler@timbira.com.br> wrote:
Oops... exactly. That was an oversight while experimenting with different types of slots.
Here is a rebased version including this small fix.
--
Euler Taveira Timbira -
http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Attachments:
0008-Debug-for-row-filtering.patch (text/x-patch)
From eac82a07d5a0b7d0c71490469f6ca473950e0333 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Wed, 14 Mar 2018 00:53:17 +0000
Subject: [PATCH 8/8] Debug for row filtering
---
src/backend/commands/publicationcmds.c | 11 +++++
src/backend/replication/logical/tablesync.c | 1 +
src/backend/replication/pgoutput/pgoutput.c | 66 +++++++++++++++++++++++++++++
3 files changed, 78 insertions(+)
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 716ed2ec58..d0406d14d7 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -341,6 +341,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *rels = NIL;
Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
Oid pubid = pubform->oid;
+ ListCell *lc;
/* Check that user is allowed to manipulate the publication tables. */
if (pubform->puballtables)
@@ -352,6 +353,16 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ foreach(lc, stmt->tables)
+ {
+ PublicationTable *t = lfirst(lc);
+
+ if (t->whereClause == NULL)
+ elog(DEBUG3, "publication \"%s\" has no WHERE clause", NameStr(pubform->pubname));
+ else
+ elog(DEBUG3, "publication \"%s\" has WHERE clause", NameStr(pubform->pubname));
+ }
+
/*
* ALTER PUBLICATION ... DROP TABLE cannot contain a WHERE clause. Use
* publication_table_list node (that accepts a WHERE clause) but forbid the
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 5468b694f6..1fc7d5647b 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -865,6 +865,7 @@ copy_table(Relation rel)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
}
+ elog(DEBUG2, "COPY for initial synchronization: %s", cmd.data);
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 89a80a8abc..f94dab88b6 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -34,6 +34,7 @@
#include "utils/builtins.h"
#include "utils/inval.h"
#include "utils/int8.h"
+#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -323,6 +324,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
MemoryContext old;
RelationSyncEntry *relentry;
+ Form_pg_class class_form;
+ char *schemaname;
+ char *tablename;
+
if (!is_publishable_relation(relation))
return;
@@ -347,6 +352,17 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Assert(false);
}
+ class_form = RelationGetForm(relation);
+ schemaname = get_namespace_name(class_form->relnamespace);
+ tablename = NameStr(class_form->relname);
+
+ if (change->action == REORDER_BUFFER_CHANGE_INSERT)
+ elog(DEBUG1, "INSERT \"%s\".\"%s\" txid: %u", schemaname, tablename, txn->xid);
+ else if (change->action == REORDER_BUFFER_CHANGE_UPDATE)
+ elog(DEBUG1, "UPDATE \"%s\".\"%s\" txid: %u", schemaname, tablename, txn->xid);
+ else if (change->action == REORDER_BUFFER_CHANGE_DELETE)
+ elog(DEBUG1, "DELETE \"%s\".\"%s\" txid: %u", schemaname, tablename, txn->xid);
+
/* ... then check row filter */
if (list_length(relentry->qual) > 0)
{
@@ -364,6 +380,42 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
tupdesc = RelationGetDescr(relation);
estate = create_estate_for_relation(relation);
+#ifdef _NOT_USED
+ if (old_tuple)
+ {
+ int i;
+
+ for (i = 0; i < tupdesc->natts; i++)
+ {
+ Form_pg_attribute attr;
+ HeapTuple type_tuple;
+ Oid typoutput;
+ bool typisvarlena;
+ bool isnull;
+ Datum val;
+ char *outputstr = NULL;
+
+ attr = TupleDescAttr(tupdesc, i);
+
+ /* Figure out type name */
+ type_tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(attr->atttypid));
+ if (HeapTupleIsValid(type_tuple))
+ {
+ /* Get information needed for printing values of a type */
+ getTypeOutputInfo(attr->atttypid, &typoutput, &typisvarlena);
+
+ val = heap_getattr(old_tuple, i + 1, tupdesc, &isnull);
+ if (!isnull)
+ {
+ outputstr = OidOutputFunctionCall(typoutput, val);
+ elog(DEBUG2, "row filter: REPLICA IDENTITY %s: %s", NameStr(attr->attname), outputstr);
+ pfree(outputstr);
+ }
+ }
+ }
+ }
+#endif
+
/* prepare context per tuple */
ecxt = GetPerTupleExprContext(estate);
oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
@@ -379,6 +431,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Oid expr_type;
Datum res;
bool isnull;
+ char *s = NULL;
qual = (Node *) lfirst(lc);
@@ -389,12 +442,22 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
expr_state = ExecInitExpr(expr, NULL);
res = ExecEvalExpr(expr_state, ecxt, &isnull);
+ elog(DEBUG3, "row filter: result: %s ; isnull: %s", (DatumGetBool(res)) ? "true" : "false", (isnull) ? "true" : "false");
+
/* if tuple does not match row filter, bail out */
if (!DatumGetBool(res) || isnull)
{
+ s = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, CStringGetTextDatum(nodeToString(qual)), ObjectIdGetDatum(relentry->relid)));
+ elog(DEBUG2, "row filter \"%s\" was not matched", s);
+ pfree(s);
+
matched = false;
break;
}
+
+ s = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, CStringGetTextDatum(nodeToString(qual)), ObjectIdGetDatum(relentry->relid)));
+ elog(DEBUG2, "row filter \"%s\" was matched", s);
+ pfree(s);
}
MemoryContextSwitchTo(oldcxt);
@@ -668,9 +731,12 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
MemoryContext oldctx = MemoryContextSwitchTo(CacheMemoryContext);
char *s = TextDatumGetCString(rf_datum);
+ char *t = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, rf_datum, ObjectIdGetDatum(entry->relid)));
Node *rf_node = stringToNode(s);
entry->qual = lappend(entry->qual, rf_node);
MemoryContextSwitchTo(oldctx);
+
+ elog(DEBUG2, "row filter \"%s\" found for publication \"%s\" and relation \"%s\"", t, pub->name, get_rel_name(relid));
}
ReleaseSysCache(rf_tuple);
--
2.11.0
0002-Store-number-of-tuples-in-WalRcvExecResult.patch (text/x-patch)
From 92474dd8e15d58e253d5b5aa76348d8973bf6d04 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Fri, 9 Mar 2018 17:37:36 +0000
Subject: [PATCH 2/8] Store number of tuples in WalRcvExecResult
It seems to be useful information when allocating memory for queries
that return more than one row. It reduces memory allocation
for initial table synchronization.
---
src/backend/replication/libpqwalreceiver/libpqwalreceiver.c | 5 +++--
src/backend/replication/logical/tablesync.c | 5 ++---
src/include/replication/walreceiver.h | 1 +
3 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 6eba08a920..343550a335 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -878,6 +878,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
errdetail("Expected %d fields, got %d fields.",
nRetTypes, nfields)));
+ walres->ntuples = PQntuples(pgres);
walres->tuplestore = tuplestore_begin_heap(true, false, work_mem);
/* Create tuple descriptor corresponding to expected result. */
@@ -888,7 +889,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
/* No point in doing more here if there were no tuples returned. */
- if (PQntuples(pgres) == 0)
+ if (walres->ntuples == 0)
return;
/* Create temporary context for local allocations. */
@@ -897,7 +898,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
ALLOCSET_DEFAULT_SIZES);
/* Process returned rows. */
- for (tupn = 0; tupn < PQntuples(pgres); tupn++)
+ for (tupn = 0; tupn < walres->ntuples; tupn++)
{
char *cstrs[MaxTupleAttributeNumber];
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 0a565dd837..42db4ada9e 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -709,9 +709,8 @@ fetch_remote_table_info(char *nspname, char *relname,
(errmsg("could not fetch table info for table \"%s.%s\": %s",
nspname, relname, res->err)));
- /* We don't know the number of rows coming, so allocate enough space. */
- lrel->attnames = palloc0(MaxTupleAttributeNumber * sizeof(char *));
- lrel->atttyps = palloc0(MaxTupleAttributeNumber * sizeof(Oid));
+ lrel->attnames = palloc0(res->ntuples * sizeof(char *));
+ lrel->atttyps = palloc0(res->ntuples * sizeof(Oid));
lrel->attkeys = NULL;
natt = 0;
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index e12a934966..0d32d598d8 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -196,6 +196,7 @@ typedef struct WalRcvExecResult
char *err;
Tuplestorestate *tuplestore;
TupleDesc tupledesc;
+ int ntuples;
} WalRcvExecResult;
/* libpqwalreceiver hooks */
--
2.11.0
0003-Refactor-function-create_estate_for_relation.patch (text/x-patch)
From b8a8d98368ba032670788ac4f4b4ef666a4dedd0 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Tue, 27 Feb 2018 02:21:03 +0000
Subject: [PATCH 3/8] Refactor function create_estate_for_relation
Relation localrel is the only LogicalRepRelMapEntry structure member
that is useful for create_estate_for_relation.
---
src/backend/replication/logical/worker.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 11e6331f49..d9952c8b7e 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -173,7 +173,7 @@ ensure_transaction(void)
* This is based on similar code in copy.c
*/
static EState *
-create_estate_for_relation(LogicalRepRelMapEntry *rel)
+create_estate_for_relation(Relation rel)
{
EState *estate;
ResultRelInfo *resultRelInfo;
@@ -183,13 +183,13 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel)
rte = makeNode(RangeTblEntry);
rte->rtekind = RTE_RELATION;
- rte->relid = RelationGetRelid(rel->localrel);
- rte->relkind = rel->localrel->rd_rel->relkind;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
rte->rellockmode = AccessShareLock;
ExecInitRangeTable(estate, list_make1(rte));
resultRelInfo = makeNode(ResultRelInfo);
- InitResultRelInfo(resultRelInfo, rel->localrel, 1, NULL, 0);
+ InitResultRelInfo(resultRelInfo, rel, 1, NULL, 0);
estate->es_result_relations = resultRelInfo;
estate->es_num_result_relations = 1;
@@ -589,7 +589,7 @@ apply_handle_insert(StringInfo s)
}
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -696,7 +696,7 @@ apply_handle_update(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -815,7 +815,7 @@ apply_handle_delete(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
--
2.11.0
0001-Remove-unused-atttypmod-column-from-initial-table-sy.patch (text/x-patch)
From 2b398d46e12b0a9c5cf134585597991e4a2d43dc Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Fri, 9 Mar 2018 18:39:22 +0000
Subject: [PATCH 1/8] Remove unused atttypmod column from initial table
synchronization
Since commit 7c4f52409a8c7d85ed169bbbc1f6092274d03920, atttypmod was
added but not used. The removal is safe because COPY from the publisher
does not need this information.
---
src/backend/replication/logical/tablesync.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 7881079e96..0a565dd837 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -647,7 +647,7 @@ fetch_remote_table_info(char *nspname, char *relname,
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[2] = {OIDOID, CHAROID};
- Oid attrRow[4] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
+ Oid attrRow[3] = {TEXTOID, OIDOID, BOOLOID};
bool isnull;
int natt;
@@ -691,7 +691,6 @@ fetch_remote_table_info(char *nspname, char *relname,
appendStringInfo(&cmd,
"SELECT a.attname,"
" a.atttypid,"
- " a.atttypmod,"
" a.attnum = ANY(i.indkey)"
" FROM pg_catalog.pg_attribute a"
" LEFT JOIN pg_catalog.pg_index i"
@@ -703,7 +702,7 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->remoteid,
(walrcv_server_version(wrconn) >= 120000 ? "AND a.attgenerated = ''" : ""),
lrel->remoteid);
- res = walrcv_exec(wrconn, cmd.data, 4, attrRow);
+ res = walrcv_exec(wrconn, cmd.data, 3, attrRow);
if (res->status != WALRCV_OK_TUPLES)
ereport(ERROR,
@@ -724,7 +723,7 @@ fetch_remote_table_info(char *nspname, char *relname,
Assert(!isnull);
lrel->atttyps[natt] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
Assert(!isnull);
- if (DatumGetBool(slot_getattr(slot, 4, &isnull)))
+ if (DatumGetBool(slot_getattr(slot, 3, &isnull)))
lrel->attkeys = bms_add_member(lrel->attkeys, natt);
/* Should never happen. */
--
2.11.0
0004-Rename-a-WHERE-node.patch (text/x-patch)
From 32ba5ccea3e329044c14f2b3a82de463a573cb63 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Wed, 24 Jan 2018 17:01:31 -0200
Subject: [PATCH 4/8] Rename a WHERE node
A WHERE clause will be used for row filtering in logical replication. We
already have a similar node: 'WHERE (condition here)'. Let's rename the
node to a generic name and use it for row filtering too.
---
src/backend/parser/gram.y | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c97bb367f8..1de8f56794 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -476,7 +476,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <node> def_arg columnElem where_clause where_or_current_clause
a_expr b_expr c_expr AexprConst indirection_el opt_slice_bound
columnref in_expr having_clause func_table xmltable array_expr
- ExclusionWhereClause operator_def_arg
+ OptWhereClause operator_def_arg
%type <list> rowsfrom_item rowsfrom_list opt_col_def_list
%type <boolean> opt_ordinality
%type <list> ExclusionConstraintList ExclusionConstraintElem
@@ -3710,7 +3710,7 @@ ConstraintElem:
$$ = (Node *)n;
}
| EXCLUDE access_method_clause '(' ExclusionConstraintList ')'
- opt_c_include opt_definition OptConsTableSpace ExclusionWhereClause
+ opt_c_include opt_definition OptConsTableSpace OptWhereClause
ConstraintAttributeSpec
{
Constraint *n = makeNode(Constraint);
@@ -3812,7 +3812,7 @@ ExclusionConstraintElem: index_elem WITH any_operator
}
;
-ExclusionWhereClause:
+OptWhereClause:
WHERE '(' a_expr ')' { $$ = $3; }
| /*EMPTY*/ { $$ = NULL; }
;
--
2.11.0
0005-Row-filtering-for-logical-replication.patch (text/x-patch)
From 7ccf083418bb07b130963b711d4b48e138924731 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Tue, 27 Feb 2018 04:03:13 +0000
Subject: [PATCH 5/8] Row filtering for logical replication
When you define or modify a publication, you can optionally filter the
rows to be published using a WHERE condition. This condition can be any
expression that evaluates to boolean. Only rows that satisfy the
WHERE condition will be sent to subscribers.
---
doc/src/sgml/catalogs.sgml | 9 +++
doc/src/sgml/ref/alter_publication.sgml | 11 ++-
doc/src/sgml/ref/create_publication.sgml | 26 +++++-
src/backend/catalog/pg_publication.c | 102 ++++++++++++++++++++++--
src/backend/commands/publicationcmds.c | 93 +++++++++++++++-------
src/backend/parser/gram.y | 26 ++++--
src/backend/parser/parse_agg.c | 10 +++
src/backend/parser/parse_expr.c | 14 +++-
src/backend/parser/parse_func.c | 3 +
src/backend/replication/logical/tablesync.c | 119 +++++++++++++++++++++++++---
src/backend/replication/logical/worker.c | 2 +-
src/backend/replication/pgoutput/pgoutput.c | 100 ++++++++++++++++++++++-
src/include/catalog/pg_publication.h | 9 ++-
src/include/catalog/pg_publication_rel.h | 10 ++-
src/include/catalog/toasting.h | 1 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 ++-
src/include/parser/parse_node.h | 1 +
src/include/replication/logicalrelation.h | 2 +
src/test/regress/expected/publication.out | 29 +++++++
src/test/regress/sql/publication.sql | 21 +++++
src/test/subscription/t/013_row_filter.pl | 96 ++++++++++++++++++++++
22 files changed, 629 insertions(+), 67 deletions(-)
create mode 100644 src/test/subscription/t/013_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 5e71a2e865..7f11225f65 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -5595,6 +5595,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<entry><literal><link linkend="catalog-pg-class"><structname>pg_class</structname></link>.oid</literal></entry>
<entry>Reference to relation</entry>
</row>
+
+ <row>
+ <entry><structfield>prqual</structfield></entry>
+ <entry><type>pg_node_tree</type></entry>
+ <entry></entry>
+ <entry>Expression tree (in the form of a
+ <function>nodeToString()</function> representation) for the relation's
+ qualifying condition</entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index 534e598d93..9608448207 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_USER | SESSION_USER }
@@ -91,7 +91,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is executed with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 99f87ca393..6e99943374 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
@@ -68,7 +68,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
that table is added to the publication. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are added.
Optionally, <literal>*</literal> can be specified after the table name to
- explicitly indicate that descendant tables are included.
+ explicitly indicate that descendant tables are included. If the optional
+ <literal>WHERE</literal> clause is specified, rows that do not satisfy
+ the <replaceable class="parameter">expression</replaceable> will not be
+ published. Note that parentheses are required around the expression.
</para>
<para>
@@ -157,6 +160,13 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ Columns used in the <literal>WHERE</literal> clause must be part of the
+ primary key or be covered by <literal>REPLICA IDENTITY</literal>; otherwise
+ <command>UPDATE</command> and <command>DELETE</command> operations will not
+ be replicated.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -171,6 +181,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -184,6 +199,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index fd5da7d5f7..c873419a9e 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -34,6 +34,10 @@
#include "catalog/pg_publication.h"
#include "catalog/pg_publication_rel.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
+
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -149,18 +153,21 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
+ Oid relid = RelationGetRelid(targetrel->relation);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ RangeTblEntry *rte;
+ Node *whereclause;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -180,10 +187,27 @@ publication_add_relation(Oid pubid, Relation targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel), pub->name)));
+ RelationGetRelationName(targetrel->relation), pub->name)));
}
- check_publication_add_relation(targetrel);
+ check_publication_add_relation(targetrel->relation);
+
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(targetrel->whereClause);
+
+ rte = addRangeTableEntryForRelation(pstate, targetrel->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addRTEtoQuery(pstate, rte, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(targetrel->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -197,6 +221,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -213,11 +243,17 @@ publication_add_relation(Oid pubid, Relation targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel);
+ CacheInvalidateRelcache(targetrel->relation);
return myself;
}
@@ -292,6 +328,62 @@ GetPublicationRelations(Oid pubid)
}
/*
+ * Gets list of PublicationRelationQuals for a publication.
+ */
+List *
+GetPublicationRelationQuals(Oid pubid)
+{
+ List *result;
+ Relation pubrelsrel;
+ ScanKeyData scankey;
+ SysScanDesc scan;
+ HeapTuple tup;
+
+ /* Find all publications associated with the relation. */
+ pubrelsrel = heap_open(PublicationRelRelationId, AccessShareLock);
+
+ ScanKeyInit(&scankey,
+ Anum_pg_publication_rel_prpubid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(pubid));
+
+ scan = systable_beginscan(pubrelsrel, PublicationRelPrrelidPrpubidIndexId,
+ true, NULL, 1, &scankey);
+
+ result = NIL;
+ while (HeapTupleIsValid(tup = systable_getnext(scan)))
+ {
+ Form_pg_publication_rel pubrel;
+ PublicationRelationQual *relqual;
+ Datum value_datum;
+ char *qual_value;
+ Node *qual_expr;
+ bool isnull;
+
+ pubrel = (Form_pg_publication_rel) GETSTRUCT(tup);
+
+ value_datum = heap_getattr(tup, Anum_pg_publication_rel_prqual, RelationGetDescr(pubrelsrel), &isnull);
+ if (!isnull)
+ {
+ qual_value = TextDatumGetCString(value_datum);
+ qual_expr = (Node *) stringToNode(qual_value);
+ }
+ else
+ qual_expr = NULL;
+
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = table_open(pubrel->prrelid, ShareUpdateExclusiveLock);
+ relqual->whereClause = copyObject(qual_expr);
+ result = lappend(result, relqual);
+ }
+
+ systable_endscan(scan);
+ heap_close(pubrelsrel, AccessShareLock);
+
+ return result;
+}
+
+/*
* Gets list of publication oids for publications marked as FOR ALL TABLES.
*/
List *
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index f115d4bf80..716ed2ec58 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -352,6 +352,27 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ /*
+ * ALTER PUBLICATION ... DROP TABLE cannot contain a WHERE clause. Use the
+ * publication_table_list node (which accepts a WHERE clause) but forbid the
+ * WHERE clause in it. Using a separate relation_expr_list node just for the
+ * DROP TABLE part is not worth the trouble.
+ */
+ if (stmt->tableAction == DEFELEM_DROP)
+ {
+ ListCell *lc;
+
+ foreach(lc, stmt->tables)
+ {
+ PublicationTable *t = lfirst(lc);
+ if (t->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot use a WHERE clause when removing a table from publication \"%s\"",
+ NameStr(pubform->pubname))));
+ }
+ }
+
rels = OpenTableList(stmt->tables);
if (stmt->tableAction == DEFELEM_ADD)
@@ -360,47 +381,55 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
PublicationDropTables(pubid, rels, false);
else /* DEFELEM_SET */
{
- List *oldrelids = GetPublicationRelations(pubid);
+ List *oldrels = GetPublicationRelationQuals(pubid);
List *delrels = NIL;
ListCell *oldlc;
/* Calculate which relations to drop. */
- foreach(oldlc, oldrelids)
+ foreach(oldlc, oldrels)
{
- Oid oldrelid = lfirst_oid(oldlc);
+ PublicationRelationQual *oldrel = lfirst(oldlc);
+ PublicationRelationQual *newrel;
ListCell *newlc;
bool found = false;
foreach(newlc, rels)
{
- Relation newrel = (Relation) lfirst(newlc);
+ newrel = (PublicationRelationQual *) lfirst(newlc);
- if (RelationGetRelid(newrel) == oldrelid)
+ if (RelationGetRelid(newrel->relation) == RelationGetRelid(oldrel->relation))
{
found = true;
break;
}
}
- if (!found)
+ /*
+ * Remove the publication / relation mapping if (i) the table is not
+ * found in the new list or (ii) the table is found in the new list
+ * but its qual does not match the old one (in this case, a simple
+ * tuple update is not enough because of the dependencies).
+ */
+ if (!found || (found && !equal(oldrel->whereClause, newrel->whereClause)))
{
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
+ PublicationRelationQual *oldrelqual = palloc(sizeof(PublicationRelationQual));
+ oldrelqual->relation = table_open(RelationGetRelid(oldrel->relation),
+ ShareUpdateExclusiveLock);
- delrels = lappend(delrels, oldrel);
+ delrels = lappend(delrels, oldrelqual);
}
}
/* And drop them. */
PublicationDropTables(pubid, delrels, true);
+ CloseTableList(oldrels);
+ CloseTableList(delrels);
/*
* Don't bother calculating the difference for adding, we'll catch and
* skip existing ones when doing catalog update.
*/
PublicationAddTables(pubid, rels, true, stmt);
-
- CloseTableList(delrels);
}
CloseTableList(rels);
@@ -510,16 +539,18 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationQual *relqual;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
- bool recurse = rv->inh;
- Relation rel;
- Oid myrelid;
+ PublicationTable *t = lfirst(lc);
+ RangeVar *rv = castNode(RangeVar, t->relation);
+ bool recurse = rv->inh;
+ Relation rel;
+ Oid myrelid;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
@@ -539,8 +570,10 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = rel;
+ relqual->whereClause = t->whereClause;
+ rels = lappend(rels, relqual);
relids = lappend_oid(relids, myrelid);
/* Add children of this rel, if requested */
@@ -568,7 +601,11 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = rel;
+ /* child inherits WHERE clause from parent */
+ relqual->whereClause = t->whereClause;
+ rels = lappend(rels, relqual);
relids = lappend_oid(relids, childrelid);
}
}
@@ -589,10 +626,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -608,13 +647,13 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(RelationGetRelid(rel->relation), GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->relation->rd_rel->relkind),
+ RelationGetRelationName(rel->relation));
obj = publication_add_relation(pubid, rel, if_not_exists);
if (stmt)
@@ -640,8 +679,8 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
+ Oid relid = RelationGetRelid(rel->relation);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
ObjectIdGetDatum(relid),
@@ -654,7 +693,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(rel->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 1de8f56794..bd87e80e1b 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -404,13 +404,13 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
relation_expr_list dostmt_opt_list
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
- publication_name_list
+ publication_name_list publication_table_list
vacuum_relation_list opt_vacuum_relation_list
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <value> publication_name_item
%type <list> opt_fdw_options fdw_options
@@ -9518,7 +9518,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9549,7 +9549,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9557,7 +9557,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9565,7 +9565,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE relation_expr_list
+ | ALTER PUBLICATION name DROP TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9575,6 +9575,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index f418c61545..dea5aadca7 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -544,6 +544,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -933,6 +940,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_GENERATED_COLUMN:
err = _("window functions are not allowed in column generation expressions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 76f3dd7076..6d2c6a28ea 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -170,6 +170,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in WHERE"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -571,6 +578,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_CALL_ARGUMENT:
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1924,13 +1932,15 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
break;
case EXPR_KIND_CALL_ARGUMENT:
err = _("cannot use subquery in CALL argument");
break;
case EXPR_KIND_COPY_WHERE:
err = _("cannot use subquery in COPY FROM WHERE condition");
break;
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3561,6 +3571,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "WHERE";
case EXPR_KIND_GENERATED_COLUMN:
return "GENERATED AS";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 8e926539e6..66458d8a48 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2516,6 +2516,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_GENERATED_COLUMN:
err = _("set-returning functions are not allowed in column generation expressions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 42db4ada9e..5468b694f6 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -637,19 +637,26 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[2] = {OIDOID, CHAROID};
Oid attrRow[3] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[1] = {TEXTOID};
bool isnull;
- int natt;
+ int n;
+ ListCell *lc;
+ bool first;
+
+ /* Avoid trashing relation map cache */
+ memset(lrel, 0, sizeof(LogicalRepRelation));
lrel->nspname = nspname;
lrel->relname = relname;
@@ -713,20 +720,20 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->atttyps = palloc0(res->ntuples * sizeof(Oid));
lrel->attkeys = NULL;
- natt = 0;
+ n = 0;
slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
{
- lrel->attnames[natt] =
+ lrel->attnames[n] =
TextDatumGetCString(slot_getattr(slot, 1, &isnull));
Assert(!isnull);
- lrel->atttyps[natt] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
+ lrel->atttyps[n] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
Assert(!isnull);
if (DatumGetBool(slot_getattr(slot, 3, &isnull)))
- lrel->attkeys = bms_add_member(lrel->attkeys, natt);
+ lrel->attkeys = bms_add_member(lrel->attkeys, n);
/* Should never happen. */
- if (++natt >= MaxTupleAttributeNumber)
+ if (++n >= MaxTupleAttributeNumber)
elog(ERROR, "too many columns in remote table \"%s.%s\"",
nspname, relname);
@@ -734,7 +741,46 @@ fetch_remote_table_info(char *nspname, char *relname,
}
ExecDropSingleTupleTableSlot(slot);
- lrel->natts = natt;
+ lrel->natts = n;
+
+ walrcv_clear_result(res);
+
+ /* Get relation qual */
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd, "SELECT pg_get_expr(prqual, prrelid) FROM pg_publication p INNER JOIN pg_publication_rel pr ON (p.oid = pr.prpubid) WHERE pr.prrelid = %u AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(wrconn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
walrcv_clear_result(res);
pfree(cmd.data);
@@ -750,6 +796,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyState cstate;
@@ -758,7 +805,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -767,10 +814,57 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* list of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
initStringInfo(&cmd);
- appendStringInfo(&cmd, "COPY %s TO STDOUT",
- quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /*
+ * If publication has any row filter, build a SELECT query with OR'ed row
+ * filters for COPY. If no row filters are available, use COPY for all
+ * table contents.
+ */
+ if (list_length(qual) > 0)
+ {
+ ListCell *lc;
+ bool first;
+
+ appendStringInfoString(&cmd, "COPY (SELECT ");
+ /* list of attribute names */
+ first = true;
+ foreach(lc, attnamelist)
+ {
+ char *col = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+ appendStringInfo(&cmd, "%s", quote_identifier(col));
+ }
+ appendStringInfo(&cmd, " FROM %s",
+ quote_qualified_identifier(lrel.nspname, lrel.relname));
+ appendStringInfoString(&cmd, " WHERE ");
+ /* list of OR'ed filters */
+ first = true;
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
+ list_free_deep(qual);
+ }
+ else
+ {
+ appendStringInfo(&cmd, "COPY %s TO STDOUT",
+ quote_qualified_identifier(lrel.nspname, lrel.relname));
+ }
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
@@ -785,7 +879,6 @@ copy_table(Relation rel)
addRangeTableEntryForRelation(pstate, rel, AccessShareLock,
NULL, false, false);
- attnamelist = make_copy_attnamelist(relmapentry);
cstate = BeginCopyFrom(pstate, rel, NULL, false, copy_read_data, attnamelist, NIL);
/* Do the copy */
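
To make the tablesync change above concrete: when at least one publication carries a row filter, copy_table builds a COPY of a SELECT whose WHERE clause ORs the individual filters together. Assuming a table public.departments published by two publications with different filters, the generated command would look roughly like this (table, column, and filter names are illustrative):

```sql
COPY (SELECT id, name
      FROM public.departments
      WHERE (id > 2000 AND id <= 3000) OR (active IS TRUE)
     ) TO STDOUT;
```

Without any filter, the plain `COPY public.departments TO STDOUT` path is used, as before.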
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index d9952c8b7e..cef0c52955 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -172,7 +172,7 @@ ensure_transaction(void)
*
* This is based on similar code in copy.c
*/
-static EState *
+EState *
create_estate_for_relation(Relation rel)
{
EState *estate;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 9c08757fca..89a80a8abc 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,15 +12,26 @@
*/
#include "postgres.h"
+#include "catalog/pg_type.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
+
+#include "executor/executor.h"
+#include "nodes/execnodes.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/planner.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "fmgr.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/inval.h"
#include "utils/int8.h"
#include "utils/memutils.h"
@@ -60,6 +71,7 @@ typedef struct RelationSyncEntry
bool schema_sent; /* did we send the schema? */
bool replicate_valid;
PublicationActions pubactions;
+ List *qual;
} RelationSyncEntry;
/* Map used to remember which relation schemas we sent. */
@@ -335,6 +347,65 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Assert(false);
}
+ /* ... then check row filter */
+ if (list_length(relentry->qual) > 0)
+ {
+ HeapTuple old_tuple;
+ HeapTuple new_tuple;
+ TupleDesc tupdesc;
+ EState *estate;
+ ExprContext *ecxt;
+ MemoryContext oldcxt;
+ ListCell *lc;
+ bool matched = true;
+
+ old_tuple = change->data.tp.oldtuple ? &change->data.tp.oldtuple->tuple : NULL;
+ new_tuple = change->data.tp.newtuple ? &change->data.tp.newtuple->tuple : NULL;
+ tupdesc = RelationGetDescr(relation);
+ estate = create_estate_for_relation(relation);
+
+ /* prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
+
+ ExecStoreHeapTuple(new_tuple ? new_tuple : old_tuple, ecxt->ecxt_scantuple, false);
+
+ foreach (lc, relentry->qual)
+ {
+ Node *qual;
+ ExprState *expr_state;
+ Expr *expr;
+ Oid expr_type;
+ Datum res;
+ bool isnull;
+
+ qual = (Node *) lfirst(lc);
+
+ /* evaluates row filter */
+ expr_type = exprType(qual);
+ expr = (Expr *) coerce_to_target_type(NULL, qual, expr_type, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+ expr = expression_planner(expr);
+ expr_state = ExecInitExpr(expr, NULL);
+ res = ExecEvalExpr(expr_state, ecxt, &isnull);
+
+ /* if tuple does not match row filter, bail out */
+ if (!DatumGetBool(res) || isnull)
+ {
+ matched = false;
+ break;
+ }
+ }
+
+ MemoryContextSwitchTo(oldcxt);
+
+ ExecDropSingleTupleTableSlot(ecxt->ecxt_scantuple);
+ FreeExecutorState(estate);
+
+ if (!matched)
+ return;
+ }
+
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
@@ -570,10 +641,14 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
*/
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->qual = NIL;
foreach(lc, data->publications)
{
Publication *pub = lfirst(lc);
+ HeapTuple rf_tuple;
+ Datum rf_datum;
+ bool rf_isnull;
if (pub->alltables || list_member_oid(pubids, pub->oid))
{
@@ -583,9 +658,23 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /* Cache row filters, if available */
+ rf_tuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rf_tuple))
+ {
+ rf_datum = SysCacheGetAttr(PUBLICATIONRELMAP, rf_tuple, Anum_pg_publication_rel_prqual, &rf_isnull);
+
+ if (!rf_isnull)
+ {
+ MemoryContext oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ char *s = TextDatumGetCString(rf_datum);
+ Node *rf_node = stringToNode(s);
+ entry->qual = lappend(entry->qual, rf_node);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rf_tuple);
+ }
}
list_free(pubids);
@@ -660,5 +749,10 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
*/
hash_seq_init(&status, RelationSyncCache);
while ((entry = (RelationSyncEntry *) hash_seq_search(&status)) != NULL)
+ {
entry->replicate_valid = false;
+ if (list_length(entry->qual) > 0)
+ list_free_deep(entry->qual);
+ entry->qual = NIL;
+ }
}
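
As I read pgoutput_change together with get_rel_sync_entry above, the quals cached for a relation come from every publication the walsender was started with, and a change is sent only if it passes all of them. A sketch of that effective semantics (publication names and values are illustrative):

```sql
-- Both publications are in one subscription's publication list.
CREATE PUBLICATION p1 FOR TABLE departments WHERE (id > 2000);
CREATE PUBLICATION p2 FOR TABLE departments WHERE (id <= 3000);

-- A change is published only if it passes every cached filter,
-- i.e. effectively (id > 2000) AND (id <= 3000).
INSERT INTO departments VALUES (2500, 'support'); -- sent
INSERT INTO departments VALUES (3500, 'legal');   -- not sent
```

Note that this is the change-streaming path; the initial COPY in tablesync ORs the filters instead.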
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 20a2f0ac1b..ae8d2845a0 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -78,15 +78,22 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationQual
+{
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationQual;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
extern List *GetPublicationRelations(Oid pubid);
+extern List *GetPublicationRelationQuals(Oid pubid);
extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(void);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index 5f5bc92ab3..a75b2d5345 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -28,9 +28,13 @@
*/
CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
{
- Oid oid; /* oid */
- Oid prpubid; /* Oid of the publication */
- Oid prrelid; /* Oid of the relation */
+ Oid oid; /* oid */
+ Oid prpubid; /* Oid of the publication */
+ Oid prrelid; /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index cc5dfed0bf..d57ca82e89 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -66,6 +66,7 @@ DECLARE_TOAST(pg_namespace, 4163, 4164);
DECLARE_TOAST(pg_partitioned_table, 4165, 4166);
DECLARE_TOAST(pg_policy, 4167, 4168);
DECLARE_TOAST(pg_proc, 2836, 2837);
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
DECLARE_TOAST(pg_rewrite, 2838, 2839);
DECLARE_TOAST(pg_seclabel, 3598, 3599);
DECLARE_TOAST(pg_statistic, 2840, 2841);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 3cbb08df92..7f83da1ee8 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -476,6 +476,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 94ded3c135..359f773092 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3461,12 +3461,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3479,7 +3486,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 7c099e7084..c2e8b9fcb9 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -73,6 +73,7 @@ typedef enum ParseExprKind
EXPR_KIND_CALL_ARGUMENT, /* procedure argument in CALL */
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 2642a3f94e..5cc307ee0e 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -39,4 +39,6 @@ extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
extern void logicalrep_typmap_update(LogicalRepTyp *remotetyp);
extern char *logicalrep_typmap_gettypname(Oid remoteid);
+extern EState *create_estate_for_relation(Relation rel);
+
#endif /* LOGICALRELATION_H */
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index feb51e4add..202173c376 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -116,6 +116,35 @@ Tables:
DROP TABLE testpub_tbl3, testpub_tbl3a;
DROP PUBLICATION testpub3, testpub4;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in WHERE
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f | t | t | t | t
+Tables:
+ "public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: "testpub_view" is not a table
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 5773a755cf..6f0d088984 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -69,6 +69,27 @@ RESET client_min_messages;
DROP TABLE testpub_tbl3, testpub_tbl3a;
DROP PUBLICATION testpub3, testpub4;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+\dRp+ testpub5
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/013_row_filter.pl b/src/test/subscription/t/013_row_filter.pl
new file mode 100644
index 0000000000..99e6db94d6
--- /dev/null
+++ b/src/test/subscription/t/013_row_filter.pl
@@ -0,0 +1,96 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 4;
+
+# create publisher node
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')");
+
+my $result = $node_publisher->psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 DROP TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')");
+is($result, 3, "syntax error for ALTER PUBLICATION DROP TABLE");
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)");
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3");
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)");
+
+# test row filtering
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 10)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table sync to finish
+my $synced_query =
+"SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+#$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT a, b FROM tab_rowfilter_1");
+is($result, qq(1980|not filtered
+1001|test 1001
+1002|test 1002), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(7|2|10), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check filtered data was copied to subscriber');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
2.11.0
0006-Print-publication-WHERE-condition-in-psql.patch (text/x-patch; charset=US-ASCII)
From c34e95f676a47219b12e06868b1304fbd91f4b3d Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Thu, 17 May 2018 20:52:28 +0000
Subject: [PATCH 6/8] Print publication WHERE condition in psql
---
src/bin/psql/describe.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 774cc764ff..819b74bf4c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5875,7 +5875,8 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
+ "SELECT n.nspname, c.relname,\n"
+ " pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
@@ -5905,6 +5906,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE %s",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
--
2.11.0
0007-Publication-where-condition-support-for-pg_dump.patch (text/x-patch; charset=US-ASCII)
From 537e5b46211fd6b9987b47752a3b674e887e3157 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Sat, 15 Sep 2018 02:52:00 +0000
Subject: [PATCH 7/8] Publication where condition support for pg_dump
---
src/bin/pg_dump/pg_dump.c | 15 +++++++++++++--
src/bin/pg_dump/pg_dump.h | 1 +
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 7ec0c84540..69153fa28d 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3959,6 +3959,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_tableoid;
int i_oid;
int i_pubname;
+ int i_pubrelqual;
int i,
j,
ntups;
@@ -3991,7 +3992,8 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
/* Get the publication membership for the table. */
appendPQExpBuffer(query,
- "SELECT pr.tableoid, pr.oid, p.pubname "
+ "SELECT pr.tableoid, pr.oid, p.pubname, "
+ "pg_catalog.pg_get_expr(pr.prqual, pr.prrelid) AS pubrelqual "
"FROM pg_publication_rel pr, pg_publication p "
"WHERE pr.prrelid = '%u'"
" AND p.oid = pr.prpubid",
@@ -4012,6 +4014,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_pubname = PQfnumber(res, "pubname");
+ i_pubrelqual = PQfnumber(res, "pubrelqual");
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4027,6 +4030,11 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].pubname = pg_strdup(PQgetvalue(res, j, i_pubname));
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, j, i_pubrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, j, i_pubrelqual));
+
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
}
@@ -4055,8 +4063,11 @@ dumpPublicationTable(Archive *fout, PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubrinfo->pubname));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE %s", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index dfba58ac58..3ed2e1be9c 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -608,6 +608,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
TableInfo *pubtable;
char *pubname;
+ char *pubrelqual;
} PublicationRelInfo;
/*
--
2.11.0
Hello
I found several problems when testing the patches:
1. There are some regression test failures after applying 0001.patch through
0005.patch. The failures are resolved by 0006.patch.
2. Incorrect data appears after creating a subscription if the relation has
an inheritance child table, for example:
##########################
The Tables:
CREATE TABLE cities (
name text,
population float,
altitude int
);
CREATE TABLE capitals (
state char(2)
) INHERITS (cities);
Do on publication:
insert into cities values('aaa',123, 134);
insert into capitals values('bbb',123, 134);
create publication pub_tc for table cities where (altitude > 100 and altitude < 200);
postgres=# select * from cities ;
name | population | altitude
------+------------+----------
aaa | 123 | 134
bbb | 123 | 134
(2 rows)
Do on subscription:
create subscription sub_tc connection 'host=localhost port=5432 dbname=postgres' publication pub_tc;
postgres=# select * from cities ;
name | population | altitude
------+------------+----------
aaa | 123 | 134
bbb | 123 | 134
bbb | 123 | 134
(3 rows)
##########################
An unexpected row appears.
3. The UPDATE behavior puzzles me.
Using the tables from problem 2, I tested as follows:
#########################
On publication:
postgres=# insert into cities values('t1',123, 34);
INSERT 0 1
postgres=# update cities SET altitude = 134 where altitude = 34;
UPDATE 1
postgres=# select * from cities ;
name | population | altitude
------+------------+----------
t1 | 123 | 134
(1 row)
On subscription:
postgres=# select * from cities ;
name | population | altitude
------+------------+----------
(0 rows)
On publication:
insert into cities values('t1',1,'135');
update cities set altitude=300 where altitude=135;
postgres=# table cities ;
name | population | altitude
------+------------+----------
t1 | 123 | 134
t1 | 1 | 300
(2 rows)
On subscription:
postgres=# table cities ;
name | population | altitude
------+------------+----------
t1 | 1 | 135
(1 row)
#########################
Result 1: updating a row so that it goes from not matching the publication
condition to matching it changes nothing on the subscriber.
Result 2: updating a row so that it goes from matching the publication
condition to not matching it also changes nothing on the subscriber.
Is this a bug, or should this behavior be documented?
4. The SQL string construction code in the fetch_remote_table_info() function is too long.
---
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead.li@highgo.ca
The new status of this patch is: Waiting on Author
On Mon, Sep 23, 2019 at 01:59, movead li <movead.li@highgo.ca> wrote:
I found several problems when testing the patches:
First of all, thanks for your review.
1. There are some regression test failures after applying 0001.patch through
0005.patch. The failures are resolved by 0006.patch.
Which regression?
2. Incorrect data appears after creating a subscription if the relation has
an inheritance child table, for example:
Ouch. Good catch! I forgot the ONLY in the COPY with a query. I will add
a test for it.
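For readers following the thread, here is a minimal sketch of what the missing ONLY likely means for the initial-sync query (the exact query built by tablesync is an assumption here; the table names are from the example above):

```sql
-- Without ONLY, the filtered COPY query for the parent also scans
-- inheritance children, so the 'bbb' row stored in capitals is copied
-- into the subscriber's cities in addition to being synchronized via
-- capitals itself, hence the duplicate.
COPY (SELECT name, population, altitude
      FROM cities
      WHERE altitude > 100 AND altitude < 200) TO STDOUT;

-- Restricting the query to the parent table avoids the duplicate:
COPY (SELECT name, population, altitude
      FROM ONLY cities
      WHERE altitude > 100 AND altitude < 200) TO STDOUT;
```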
3. The UPDATE behavior puzzles me.
Using the tables from problem 2, I tested as follows:
#########################
On publication:
postgres=# insert into cities values('t1',123, 34);
INSERT 0 1
INSERT isn't replicated.
postgres=# update cities SET altitude = 134 where altitude = 34;
UPDATE 1
There should be an error because you don't have a PK or REPLICA IDENTITY.
postgres=# update cities SET altitude = 134 where altitude = 34;
ERROR: cannot update table "cities" because it does not have a
replica identity and publishes updates
HINT: To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
Even if you create a PK or REPLICA IDENTITY, it won't turn this UPDATE
into an INSERT and send it to the other node (the UPDATE is indeed sent,
but there is no tuple to update on the subscriber). Also, filter columns
must be part of the PK or REPLICA IDENTITY. I explain this in the
documentation.
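As a sketch of that explanation (using the example tables above; this is illustrative, not output from the patched server):

```sql
-- Give cities a replica identity so UPDATEs can be published at all.
-- FULL covers every column, so the filter column (altitude) is included.
ALTER TABLE cities REPLICA IDENTITY FULL;

-- The row is created outside the filter range, so it is never replicated.
INSERT INTO cities VALUES ('t1', 123, 34);

-- This UPDATE now satisfies the filter and is sent as an UPDATE, but the
-- subscriber has no matching tuple, so it is a no-op there; the UPDATE is
-- not converted into an INSERT.
UPDATE cities SET altitude = 134 WHERE altitude = 34;
```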
4. The SQL string construction code in the fetch_remote_table_info() function is too long.
I split it into smaller pieces. I also ran pgindent to improve the code
style. I'll send a patchset later today.
Regards,
--
Euler Taveira Timbira -
http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
On Wed, Sep 25, 2019 at 08:08, Euler Taveira <euler@timbira.com.br> wrote:
I'll send a patchset later today.
... and it is attached.
--
Euler Taveira Timbira -
http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Attachments:
0001-Remove-unused-atttypmod-column-from-initial-table-sy.patch (text/x-patch; charset=US-ASCII)
From b5d4d1369dbb4e7ec20182507dc5ae920dd8d2e9 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Fri, 9 Mar 2018 18:39:22 +0000
Subject: [PATCH 1/8] Remove unused atttypmod column from initial table
synchronization
Since commit 7c4f52409a8c7d85ed169bbbc1f6092274d03920, atttypmod was
added but not used. The removal is safe because COPY from publisher
does not need such information.
---
src/backend/replication/logical/tablesync.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 7881079..0a565dd 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -647,7 +647,7 @@ fetch_remote_table_info(char *nspname, char *relname,
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[2] = {OIDOID, CHAROID};
- Oid attrRow[4] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
+ Oid attrRow[3] = {TEXTOID, OIDOID, BOOLOID};
bool isnull;
int natt;
@@ -691,7 +691,6 @@ fetch_remote_table_info(char *nspname, char *relname,
appendStringInfo(&cmd,
"SELECT a.attname,"
" a.atttypid,"
- " a.atttypmod,"
" a.attnum = ANY(i.indkey)"
" FROM pg_catalog.pg_attribute a"
" LEFT JOIN pg_catalog.pg_index i"
@@ -703,7 +702,7 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->remoteid,
(walrcv_server_version(wrconn) >= 120000 ? "AND a.attgenerated = ''" : ""),
lrel->remoteid);
- res = walrcv_exec(wrconn, cmd.data, 4, attrRow);
+ res = walrcv_exec(wrconn, cmd.data, 3, attrRow);
if (res->status != WALRCV_OK_TUPLES)
ereport(ERROR,
@@ -724,7 +723,7 @@ fetch_remote_table_info(char *nspname, char *relname,
Assert(!isnull);
lrel->atttyps[natt] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
Assert(!isnull);
- if (DatumGetBool(slot_getattr(slot, 4, &isnull)))
+ if (DatumGetBool(slot_getattr(slot, 3, &isnull)))
lrel->attkeys = bms_add_member(lrel->attkeys, natt);
/* Should never happen. */
--
2.7.4
0002-Store-number-of-tuples-in-WalRcvExecResult.patch (text/x-patch; charset=US-ASCII)
From 406b2dbe4df63a94364e548a67d085e255ea2644 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Fri, 9 Mar 2018 17:37:36 +0000
Subject: [PATCH 2/8] Store number of tuples in WalRcvExecResult
This information is useful when allocating memory for queries that return
more than one row. It reduces memory allocation during initial table
synchronization.
---
src/backend/replication/libpqwalreceiver/libpqwalreceiver.c | 5 +++--
src/backend/replication/logical/tablesync.c | 5 ++---
src/include/replication/walreceiver.h | 1 +
3 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 6eba08a..343550a 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -878,6 +878,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
errdetail("Expected %d fields, got %d fields.",
nRetTypes, nfields)));
+ walres->ntuples = PQntuples(pgres);
walres->tuplestore = tuplestore_begin_heap(true, false, work_mem);
/* Create tuple descriptor corresponding to expected result. */
@@ -888,7 +889,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
/* No point in doing more here if there were no tuples returned. */
- if (PQntuples(pgres) == 0)
+ if (walres->ntuples == 0)
return;
/* Create temporary context for local allocations. */
@@ -897,7 +898,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
ALLOCSET_DEFAULT_SIZES);
/* Process returned rows. */
- for (tupn = 0; tupn < PQntuples(pgres); tupn++)
+ for (tupn = 0; tupn < walres->ntuples; tupn++)
{
char *cstrs[MaxTupleAttributeNumber];
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 0a565dd..42db4ad 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -709,9 +709,8 @@ fetch_remote_table_info(char *nspname, char *relname,
(errmsg("could not fetch table info for table \"%s.%s\": %s",
nspname, relname, res->err)));
- /* We don't know the number of rows coming, so allocate enough space. */
- lrel->attnames = palloc0(MaxTupleAttributeNumber * sizeof(char *));
- lrel->atttyps = palloc0(MaxTupleAttributeNumber * sizeof(Oid));
+ lrel->attnames = palloc0(res->ntuples * sizeof(char *));
+ lrel->atttyps = palloc0(res->ntuples * sizeof(Oid));
lrel->attkeys = NULL;
natt = 0;
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index e12a934..0d32d59 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -196,6 +196,7 @@ typedef struct WalRcvExecResult
char *err;
Tuplestorestate *tuplestore;
TupleDesc tupledesc;
+ int ntuples;
} WalRcvExecResult;
/* libpqwalreceiver hooks */
--
2.7.4
0004-Rename-a-WHERE-node.patch (text/x-patch; charset=US-ASCII)
From 428c6f3959b67627e9f1c92fdf38d71bb66163ef Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Wed, 24 Jan 2018 17:01:31 -0200
Subject: [PATCH 4/8] Rename a WHERE node
A WHERE clause will be used for row filtering in logical replication. We
already have a similar node: 'WHERE (condition here)'. Let's rename the
node to a generic name and use it for row filtering too.
---
src/backend/parser/gram.y | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 3f67aaf..21bef5c 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -476,7 +476,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <node> def_arg columnElem where_clause where_or_current_clause
a_expr b_expr c_expr AexprConst indirection_el opt_slice_bound
columnref in_expr having_clause func_table xmltable array_expr
- ExclusionWhereClause operator_def_arg
+ OptWhereClause operator_def_arg
%type <list> rowsfrom_item rowsfrom_list opt_col_def_list
%type <boolean> opt_ordinality
%type <list> ExclusionConstraintList ExclusionConstraintElem
@@ -3711,7 +3711,7 @@ ConstraintElem:
$$ = (Node *)n;
}
| EXCLUDE access_method_clause '(' ExclusionConstraintList ')'
- opt_c_include opt_definition OptConsTableSpace ExclusionWhereClause
+ opt_c_include opt_definition OptConsTableSpace OptWhereClause
ConstraintAttributeSpec
{
Constraint *n = makeNode(Constraint);
@@ -3813,7 +3813,7 @@ ExclusionConstraintElem: index_elem WITH any_operator
}
;
-ExclusionWhereClause:
+OptWhereClause:
WHERE '(' a_expr ')' { $$ = $3; }
| /*EMPTY*/ { $$ = NULL; }
;
--
2.7.4
0003-Refactor-function-create_estate_for_relation.patch (text/x-patch; charset=US-ASCII)
From 45231f2c46b61aabb0fcb4f938589a6c21aad2c5 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Tue, 27 Feb 2018 02:21:03 +0000
Subject: [PATCH 3/8] Refactor function create_estate_for_relation
Relation localrel is the only LogicalRepRelMapEntry structure member
that is useful for create_estate_for_relation.
---
src/backend/replication/logical/worker.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 11e6331..d9952c8 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -173,7 +173,7 @@ ensure_transaction(void)
* This is based on similar code in copy.c
*/
static EState *
-create_estate_for_relation(LogicalRepRelMapEntry *rel)
+create_estate_for_relation(Relation rel)
{
EState *estate;
ResultRelInfo *resultRelInfo;
@@ -183,13 +183,13 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel)
rte = makeNode(RangeTblEntry);
rte->rtekind = RTE_RELATION;
- rte->relid = RelationGetRelid(rel->localrel);
- rte->relkind = rel->localrel->rd_rel->relkind;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
rte->rellockmode = AccessShareLock;
ExecInitRangeTable(estate, list_make1(rte));
resultRelInfo = makeNode(ResultRelInfo);
- InitResultRelInfo(resultRelInfo, rel->localrel, 1, NULL, 0);
+ InitResultRelInfo(resultRelInfo, rel, 1, NULL, 0);
estate->es_result_relations = resultRelInfo;
estate->es_num_result_relations = 1;
@@ -589,7 +589,7 @@ apply_handle_insert(StringInfo s)
}
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -696,7 +696,7 @@ apply_handle_update(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -815,7 +815,7 @@ apply_handle_delete(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
--
2.7.4
0005-Row-filtering-for-logical-replication.patch (text/x-patch; charset=US-ASCII)
From e14f1892427778b728c0a684e68d89ee172a4679 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Tue, 27 Feb 2018 04:03:13 +0000
Subject: [PATCH 5/8] Row filtering for logical replication
When you define or modify a publication, you can optionally filter the
rows to be published using a WHERE condition. This condition can be any
expression that evaluates to boolean. Only rows that satisfy the WHERE
condition will be sent to subscribers.
---
doc/src/sgml/catalogs.sgml | 9 ++
doc/src/sgml/ref/alter_publication.sgml | 11 ++-
doc/src/sgml/ref/create_publication.sgml | 26 +++++-
src/backend/catalog/pg_publication.c | 102 ++++++++++++++++++++--
src/backend/commands/publicationcmds.c | 89 +++++++++++++------
src/backend/parser/gram.y | 26 ++++--
src/backend/parser/parse_agg.c | 10 +++
src/backend/parser/parse_expr.c | 14 ++-
src/backend/parser/parse_func.c | 3 +
src/backend/replication/logical/tablesync.c | 127 +++++++++++++++++++++++++---
src/backend/replication/logical/worker.c | 2 +-
src/backend/replication/pgoutput/pgoutput.c | 101 +++++++++++++++++++++-
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 4 +
src/include/catalog/toasting.h | 1 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 ++-
src/include/parser/parse_node.h | 1 +
src/include/replication/logicalrelation.h | 2 +
src/test/regress/expected/publication.out | 29 +++++++
src/test/regress/sql/publication.sql | 21 +++++
src/test/subscription/t/013_row_filter.pl | 96 +++++++++++++++++++++
22 files changed, 634 insertions(+), 61 deletions(-)
create mode 100644 src/test/subscription/t/013_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 5e71a2e..7f11225 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -5595,6 +5595,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<entry><literal><link linkend="catalog-pg-class"><structname>pg_class</structname></link>.oid</literal></entry>
<entry>Reference to relation</entry>
</row>
+
+ <row>
+ <entry><structfield>prqual</structfield></entry>
+ <entry><type>pg_node_tree</type></entry>
+ <entry></entry>
+ <entry>Expression tree (in the form of a
+ <function>nodeToString()</function> representation) for the relation's
+ qualifying condition</entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index 534e598..9608448 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_USER | SESSION_USER }
@@ -91,7 +91,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is executed with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 99f87ca..6e99943 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
@@ -68,7 +68,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
that table is added to the publication. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are added.
Optionally, <literal>*</literal> can be specified after the table name to
- explicitly indicate that descendant tables are included.
+ explicitly indicate that descendant tables are included. If the optional
+ <literal>WHERE</literal> clause is specified, rows that do not satisfy
+ the <replaceable class="parameter">expression</replaceable> will not be
+ published. Note that parentheses are required around the expression.
</para>
<para>
@@ -157,6 +160,13 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ Columns used in the <literal>WHERE</literal> clause must be part of the
+ primary key or be covered by <literal>REPLICA IDENTITY</literal> otherwise
+ <command>UPDATE</command> and <command>DELETE</command> operations will not
+ be replicated.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
on the outcome, it may be published as either <command>INSERT</command> or
@@ -171,6 +181,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -184,6 +199,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
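
[For readers following along, here is a sketch of the documented behavior; the table and data are hypothetical, not from the patch's test suite:]

```sql
-- Only rows satisfying the expression are published;
-- parentheses around the expression are required.
CREATE TABLE departments (id int PRIMARY KEY, active boolean);
CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);

INSERT INTO departments VALUES (1, true);   -- replicated to subscribers
INSERT INTO departments VALUES (2, false);  -- filtered out, not replicated
```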
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index fd5da7d..f5462dc 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -34,6 +34,10 @@
#include "catalog/pg_publication.h"
#include "catalog/pg_publication_rel.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
+
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -149,18 +153,21 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
+ Oid relid = RelationGetRelid(targetrel->relation);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ RangeTblEntry *rte;
+ Node *whereclause;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -180,10 +187,27 @@ publication_add_relation(Oid pubid, Relation targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel), pub->name)));
+ RelationGetRelationName(targetrel->relation), pub->name)));
}
- check_publication_add_relation(targetrel);
+ check_publication_add_relation(targetrel->relation);
+
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(targetrel->whereClause);
+
+ rte = addRangeTableEntryForRelation(pstate, targetrel->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addRTEtoQuery(pstate, rte, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(targetrel->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -197,6 +221,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -213,11 +243,17 @@ publication_add_relation(Oid pubid, Relation targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel);
+ CacheInvalidateRelcache(targetrel->relation);
return myself;
}
@@ -292,6 +328,62 @@ GetPublicationRelations(Oid pubid)
}
/*
+ * Gets list of PublicationRelationQuals for a publication.
+ */
+List *
+GetPublicationRelationQuals(Oid pubid)
+{
+ List *result;
+ Relation pubrelsrel;
+ ScanKeyData scankey;
+ SysScanDesc scan;
+ HeapTuple tup;
+
+ /* Find all relations associated with the publication. */
+ pubrelsrel = table_open(PublicationRelRelationId, AccessShareLock);
+
+ ScanKeyInit(&scankey,
+ Anum_pg_publication_rel_prpubid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(pubid));
+
+ scan = systable_beginscan(pubrelsrel, PublicationRelPrrelidPrpubidIndexId,
+ true, NULL, 1, &scankey);
+
+ result = NIL;
+ while (HeapTupleIsValid(tup = systable_getnext(scan)))
+ {
+ Form_pg_publication_rel pubrel;
+ PublicationRelationQual *relqual;
+ Datum value_datum;
+ char *qual_value;
+ Node *qual_expr;
+ bool isnull;
+
+ pubrel = (Form_pg_publication_rel) GETSTRUCT(tup);
+
+ value_datum = heap_getattr(tup, Anum_pg_publication_rel_prqual, RelationGetDescr(pubrelsrel), &isnull);
+ if (!isnull)
+ {
+ qual_value = TextDatumGetCString(value_datum);
+ qual_expr = (Node *) stringToNode(qual_value);
+ }
+ else
+ qual_expr = NULL;
+
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = table_open(pubrel->prrelid, ShareUpdateExclusiveLock);
+ relqual->whereClause = copyObject(qual_expr);
+ result = lappend(result, relqual);
+ }
+
+ systable_endscan(scan);
+ table_close(pubrelsrel, AccessShareLock);
+
+ return result;
+}
+
+/*
* Gets list of publication oids for publications marked as FOR ALL TABLES.
*/
List *
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index f115d4b..2606377 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -352,6 +352,28 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ /*
+ * ALTER PUBLICATION ... DROP TABLE cannot contain a WHERE clause. Use
+ * the publication_table_list node (that accepts a WHERE clause) but
+ * forbid the WHERE clause in it. Using a relation_expr_list node just
+ * for the DROP TABLE part is not worth the trouble.
+ */
+ if (stmt->tableAction == DEFELEM_DROP)
+ {
+ ListCell *lc;
+
+ foreach(lc, stmt->tables)
+ {
+ PublicationTable *t = lfirst(lc);
+
+ if (t->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot use a WHERE clause for removing table from publication \"%s\"",
+ NameStr(pubform->pubname))));
+ }
+ }
+
rels = OpenTableList(stmt->tables);
if (stmt->tableAction == DEFELEM_ADD)
@@ -360,47 +382,56 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
PublicationDropTables(pubid, rels, false);
else /* DEFELEM_SET */
{
- List *oldrelids = GetPublicationRelations(pubid);
+ List *oldrels = GetPublicationRelationQuals(pubid);
List *delrels = NIL;
ListCell *oldlc;
/* Calculate which relations to drop. */
- foreach(oldlc, oldrelids)
+ foreach(oldlc, oldrels)
{
- Oid oldrelid = lfirst_oid(oldlc);
+ PublicationRelationQual *oldrel = lfirst(oldlc);
+ PublicationRelationQual *newrel;
ListCell *newlc;
bool found = false;
foreach(newlc, rels)
{
- Relation newrel = (Relation) lfirst(newlc);
+ newrel = (PublicationRelationQual *) lfirst(newlc);
- if (RelationGetRelid(newrel) == oldrelid)
+ if (RelationGetRelid(newrel->relation) == RelationGetRelid(oldrel->relation))
{
found = true;
break;
}
}
- if (!found)
+ /*
+ * Remove the publication / relation mapping if (i) the table is not
+ * found in the new list or (ii) the table is found in the new list
+ * but its qual does not match the old one (in this case, a simple
+ * tuple update is not enough because of the dependencies).
+ */
+ if (!found || !equal(oldrel->whereClause, newrel->whereClause))
{
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
+ PublicationRelationQual *oldrelqual = palloc(sizeof(PublicationRelationQual));
- delrels = lappend(delrels, oldrel);
+ oldrelqual->relation = table_open(RelationGetRelid(oldrel->relation),
+ ShareUpdateExclusiveLock);
+
+ delrels = lappend(delrels, oldrelqual);
}
}
/* And drop them. */
PublicationDropTables(pubid, delrels, true);
+ CloseTableList(oldrels);
+ CloseTableList(delrels);
/*
* Don't bother calculating the difference for adding, we'll catch and
* skip existing ones when doing catalog update.
*/
PublicationAddTables(pubid, rels, true, stmt);
-
- CloseTableList(delrels);
}
CloseTableList(rels);
@@ -510,13 +541,15 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationQual *relqual;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
+ PublicationTable *t = lfirst(lc);
+ RangeVar *rv = castNode(RangeVar, t->relation);
bool recurse = rv->inh;
Relation rel;
Oid myrelid;
@@ -539,8 +572,10 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = rel;
+ relqual->whereClause = t->whereClause;
+ rels = lappend(rels, relqual);
relids = lappend_oid(relids, myrelid);
/* Add children of this rel, if requested */
@@ -568,7 +603,11 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = rel;
+ /* child inherits WHERE clause from parent */
+ relqual->whereClause = t->whereClause;
+ rels = lappend(rels, relqual);
relids = lappend_oid(relids, childrelid);
}
}
@@ -589,10 +628,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -608,13 +649,13 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(RelationGetRelid(rel->relation), GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->relation->rd_rel->relkind),
+ RelationGetRelationName(rel->relation));
obj = publication_add_relation(pubid, rel, if_not_exists);
if (stmt)
@@ -640,8 +681,8 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
+ Oid relid = RelationGetRelid(rel->relation);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
ObjectIdGetDatum(relid),
@@ -654,7 +695,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(rel->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 21bef5c..8cad2bc 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -404,13 +404,13 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
relation_expr_list dostmt_opt_list
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
- publication_name_list
+ publication_name_list publication_table_list
vacuum_relation_list opt_vacuum_relation_list
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <value> publication_name_item
%type <list> opt_fdw_options fdw_options
@@ -9547,7 +9547,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9578,7 +9578,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9586,7 +9586,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9594,7 +9594,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE relation_expr_list
+ | ALTER PUBLICATION name DROP TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9604,6 +9604,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index f418c61..dea5aad 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -544,6 +544,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -933,6 +940,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_GENERATED_COLUMN:
err = _("window functions are not allowed in column generation expressions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 76f3dd7..6d2c6a2 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -170,6 +170,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in WHERE"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -571,6 +578,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_CALL_ARGUMENT:
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1924,13 +1932,15 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
break;
case EXPR_KIND_CALL_ARGUMENT:
err = _("cannot use subquery in CALL argument");
break;
case EXPR_KIND_COPY_WHERE:
err = _("cannot use subquery in COPY FROM WHERE condition");
break;
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3561,6 +3571,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "WHERE";
case EXPR_KIND_GENERATED_COLUMN:
return "GENERATED AS";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 8e92653..66458d8 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2516,6 +2516,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_GENERATED_COLUMN:
err = _("set-returning functions are not allowed in column generation expressions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 42db4ad..d3999b1 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -637,19 +637,26 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[2] = {OIDOID, CHAROID};
Oid attrRow[3] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[1] = {TEXTOID};
bool isnull;
- int natt;
+ int n;
+ ListCell *lc;
+ bool first;
+
+ /* Avoid trashing relation map cache */
+ memset(lrel, 0, sizeof(LogicalRepRelation));
lrel->nspname = nspname;
lrel->relname = relname;
@@ -713,20 +720,20 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->atttyps = palloc0(res->ntuples * sizeof(Oid));
lrel->attkeys = NULL;
- natt = 0;
+ n = 0;
slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
{
- lrel->attnames[natt] =
+ lrel->attnames[n] =
TextDatumGetCString(slot_getattr(slot, 1, &isnull));
Assert(!isnull);
- lrel->atttyps[natt] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
+ lrel->atttyps[n] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
Assert(!isnull);
if (DatumGetBool(slot_getattr(slot, 3, &isnull)))
- lrel->attkeys = bms_add_member(lrel->attkeys, natt);
+ lrel->attkeys = bms_add_member(lrel->attkeys, n);
/* Should never happen. */
- if (++natt >= MaxTupleAttributeNumber)
+ if (++n >= MaxTupleAttributeNumber)
elog(ERROR, "too many columns in remote table \"%s.%s\"",
nspname, relname);
@@ -734,7 +741,52 @@ fetch_remote_table_info(char *nspname, char *relname,
}
ExecDropSingleTupleTableSlot(slot);
- lrel->natts = natt;
+ lrel->natts = n;
+
+ walrcv_clear_result(res);
+
+ /* Get relation qual */
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(wrconn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
walrcv_clear_result(res);
pfree(cmd.data);
@@ -750,6 +802,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyState cstate;
@@ -758,7 +811,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -767,10 +820,59 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* list of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
initStringInfo(&cmd);
- appendStringInfo(&cmd, "COPY %s TO STDOUT",
- quote_qualified_identifier(lrel.nspname, lrel.relname));
+
+ /*
+ * If the publication has any row filters, build a SELECT query with
+ * OR'ed row filters for COPY. If no row filters are available, use
+ * COPY for the whole table contents.
+ */
+ if (list_length(qual) > 0)
+ {
+ ListCell *lc;
+ bool first;
+
+ appendStringInfoString(&cmd, "COPY (SELECT ");
+ /* list of attribute names */
+ first = true;
+ foreach(lc, attnamelist)
+ {
+ char *col = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+ appendStringInfo(&cmd, "%s", quote_identifier(col));
+ }
+ appendStringInfo(&cmd, " FROM ONLY %s",
+ quote_qualified_identifier(lrel.nspname, lrel.relname));
+ appendStringInfoString(&cmd, " WHERE ");
+ /* list of OR'ed filters */
+ first = true;
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
+ list_free_deep(qual);
+ }
+ else
+ {
+ appendStringInfo(&cmd, "COPY %s TO STDOUT",
+ quote_qualified_identifier(lrel.nspname, lrel.relname));
+ }
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
@@ -785,7 +887,6 @@ copy_table(Relation rel)
addRangeTableEntryForRelation(pstate, rel, AccessShareLock,
NULL, false, false);
- attnamelist = make_copy_attnamelist(relmapentry);
cstate = BeginCopyFrom(pstate, rel, NULL, false, copy_read_data, attnamelist, NIL);
/* Do the copy */
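
[For reference, the command copy_table() builds for a filtered table would look roughly like this; the schema, columns, and filter below are hypothetical:]

```sql
-- Initial sync for a table with one row filter: the filters are OR'ed
-- into a SELECT wrapped in COPY, restricted to the table itself (ONLY).
COPY (SELECT id, active FROM ONLY public.departments
      WHERE (active IS TRUE)) TO STDOUT;
-- Without any row filter, the plain form is used instead:
COPY public.departments TO STDOUT;
```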
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index d9952c8..cef0c52 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -172,7 +172,7 @@ ensure_transaction(void)
*
* This is based on similar code in copy.c
*/
-static EState *
+EState *
create_estate_for_relation(Relation rel)
{
EState *estate;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 9c08757..63596e2 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,15 +12,26 @@
*/
#include "postgres.h"
+#include "catalog/pg_type.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
+
+#include "executor/executor.h"
+#include "nodes/execnodes.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/planner.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "fmgr.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/inval.h"
#include "utils/int8.h"
#include "utils/memutils.h"
@@ -60,6 +71,7 @@ typedef struct RelationSyncEntry
bool schema_sent; /* did we send the schema? */
bool replicate_valid;
PublicationActions pubactions;
+ List *qual;
} RelationSyncEntry;
/* Map used to remember which relation schemas we sent. */
@@ -335,6 +347,65 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Assert(false);
}
+ /* ... then check row filter */
+ if (list_length(relentry->qual) > 0)
+ {
+ HeapTuple old_tuple;
+ HeapTuple new_tuple;
+ TupleDesc tupdesc;
+ EState *estate;
+ ExprContext *ecxt;
+ MemoryContext oldcxt;
+ ListCell *lc;
+ bool matched = true;
+
+ old_tuple = change->data.tp.oldtuple ? &change->data.tp.oldtuple->tuple : NULL;
+ new_tuple = change->data.tp.newtuple ? &change->data.tp.newtuple->tuple : NULL;
+ tupdesc = RelationGetDescr(relation);
+ estate = create_estate_for_relation(relation);
+
+ /* prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
+
+ ExecStoreHeapTuple(new_tuple ? new_tuple : old_tuple, ecxt->ecxt_scantuple, false);
+
+ foreach(lc, relentry->qual)
+ {
+ Node *qual;
+ ExprState *expr_state;
+ Expr *expr;
+ Oid expr_type;
+ Datum res;
+ bool isnull;
+
+ qual = (Node *) lfirst(lc);
+
+ /* evaluate the row filter */
+ expr_type = exprType(qual);
+ expr = (Expr *) coerce_to_target_type(NULL, qual, expr_type, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+ expr = expression_planner(expr);
+ expr_state = ExecInitExpr(expr, NULL);
+ res = ExecEvalExpr(expr_state, ecxt, &isnull);
+
+ /* if tuple does not match row filter, bail out */
+ if (!DatumGetBool(res) || isnull)
+ {
+ matched = false;
+ break;
+ }
+ }
+
+ MemoryContextSwitchTo(oldcxt);
+
+ ExecDropSingleTupleTableSlot(ecxt->ecxt_scantuple);
+ FreeExecutorState(estate);
+
+ if (!matched)
+ return;
+ }
+
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
@@ -570,10 +641,14 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
*/
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->qual = NIL;
foreach(lc, data->publications)
{
Publication *pub = lfirst(lc);
+ HeapTuple rf_tuple;
+ Datum rf_datum;
+ bool rf_isnull;
if (pub->alltables || list_member_oid(pubids, pub->oid))
{
@@ -583,9 +658,24 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /* Cache row filters, if available */
+ rf_tuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rf_tuple))
+ {
+ rf_datum = SysCacheGetAttr(PUBLICATIONRELMAP, rf_tuple, Anum_pg_publication_rel_prqual, &rf_isnull);
+
+ if (!rf_isnull)
+ {
+ MemoryContext oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ char *s = TextDatumGetCString(rf_datum);
+ Node *rf_node = stringToNode(s);
+
+ entry->qual = lappend(entry->qual, rf_node);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rf_tuple);
+ }
}
list_free(pubids);
@@ -660,5 +750,10 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
*/
hash_seq_init(&status, RelationSyncCache);
while ((entry = (RelationSyncEntry *) hash_seq_search(&status)) != NULL)
+ {
entry->replicate_valid = false;
+ if (list_length(entry->qual) > 0)
+ list_free_deep(entry->qual);
+ entry->qual = NIL;
+ }
}
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 20a2f0a..5261666 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -78,15 +78,22 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationQual
+{
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationQual;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
extern List *GetPublicationRelations(Oid pubid);
+extern List *GetPublicationRelationQuals(Oid pubid);
extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(void);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index 5f5bc92..7fd5915 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid; /* Oid of the publication */
Oid prrelid; /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index cc5dfed..d57ca82 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -66,6 +66,7 @@ DECLARE_TOAST(pg_namespace, 4163, 4164);
DECLARE_TOAST(pg_partitioned_table, 4165, 4166);
DECLARE_TOAST(pg_policy, 4167, 4168);
DECLARE_TOAST(pg_proc, 2836, 2837);
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
DECLARE_TOAST(pg_rewrite, 2838, 2839);
DECLARE_TOAST(pg_seclabel, 3598, 3599);
DECLARE_TOAST(pg_statistic, 2840, 2841);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index bce2d59..52522d0 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -477,6 +477,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index d93a79a..ca9920c 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3474,12 +3474,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3492,7 +3499,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 7c099e7..a5c9109 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -73,6 +73,7 @@ typedef enum ParseExprKind
EXPR_KIND_CALL_ARGUMENT, /* procedure argument in CALL */
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 2642a3f..5cc307e 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -39,4 +39,6 @@ extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
extern void logicalrep_typmap_update(LogicalRepTyp *remotetyp);
extern char *logicalrep_typmap_gettypname(Oid remoteid);
+extern EState *create_estate_for_relation(Relation rel);
+
#endif /* LOGICALRELATION_H */
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index feb51e4..202173c 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -116,6 +116,35 @@ Tables:
DROP TABLE testpub_tbl3, testpub_tbl3a;
DROP PUBLICATION testpub3, testpub4;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in WHERE
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f | t | t | t | t
+Tables:
+ "public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: "testpub_view" is not a table
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 5773a75..6f0d088 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -69,6 +69,27 @@ RESET client_min_messages;
DROP TABLE testpub_tbl3, testpub_tbl3a;
DROP PUBLICATION testpub3, testpub4;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+\dRp+ testpub5
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/013_row_filter.pl b/src/test/subscription/t/013_row_filter.pl
new file mode 100644
index 0000000..99e6db9
--- /dev/null
+++ b/src/test/subscription/t/013_row_filter.pl
@@ -0,0 +1,96 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 4;
+
+# create publisher node
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')");
+
+my $result = $node_publisher->psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 DROP TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')");
+is($result, 3, "syntax error for ALTER PUBLICATION DROP TABLE");
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)");
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3");
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)");
+
+# test row filtering
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 10)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table sync to finish
+my $synced_query =
+"SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+#$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT a, b FROM tab_rowfilter_1");
+is($result, qq(1980|not filtered
+1001|test 1001
+1002|test 1002), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(7|2|10), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check filtered data was copied to subscriber');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
2.7.4
0006-Print-publication-WHERE-condition-in-psql.patch (text/x-patch)
From a5ca3aa97cd50b796f58fe60e8dbc9ed196aac9b Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Thu, 17 May 2018 20:52:28 +0000
Subject: [PATCH 6/8] Print publication WHERE condition in psql
---
src/bin/psql/describe.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index d7c0fc0..76404e0 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5876,7 +5876,8 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
+ "SELECT n.nspname, c.relname,\n"
+ " pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
@@ -5906,6 +5907,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE %s",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
--
2.7.4
0008-Debug-for-row-filtering.patch (text/x-patch)
From 90f046605051e79739372339ffeae982fd45e328 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Wed, 14 Mar 2018 00:53:17 +0000
Subject: [PATCH 8/8] Debug for row filtering
---
src/backend/commands/publicationcmds.c | 11 +++++
src/backend/replication/logical/tablesync.c | 1 +
src/backend/replication/pgoutput/pgoutput.c | 66 +++++++++++++++++++++++++++++
3 files changed, 78 insertions(+)
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 2606377..b2378c6 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -341,6 +341,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *rels = NIL;
Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
Oid pubid = pubform->oid;
+ ListCell *lc;
/* Check that user is allowed to manipulate the publication tables. */
if (pubform->puballtables)
@@ -352,6 +353,16 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ foreach(lc, stmt->tables)
+ {
+ PublicationTable *t = lfirst(lc);
+
+ if (t->whereClause == NULL)
+ elog(DEBUG3, "publication \"%s\" has no WHERE clause", NameStr(pubform->pubname));
+ else
+ elog(DEBUG3, "publication \"%s\" has WHERE clause", NameStr(pubform->pubname));
+ }
+
/*
* ALTER PUBLICATION ... DROP TABLE cannot contain a WHERE clause. Use
* publication_table_list node (that accepts a WHERE clause) but forbid
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index d3999b1..2f586c0 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -873,6 +873,7 @@ copy_table(Relation rel)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
}
+ elog(DEBUG2, "COPY for initial synchronization: %s", cmd.data);
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 63596e2..6306bc3 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -34,6 +34,7 @@
#include "utils/builtins.h"
#include "utils/inval.h"
#include "utils/int8.h"
+#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -323,6 +324,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
MemoryContext old;
RelationSyncEntry *relentry;
+ Form_pg_class class_form;
+ char *schemaname;
+ char *tablename;
+
if (!is_publishable_relation(relation))
return;
@@ -347,6 +352,17 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Assert(false);
}
+ class_form = RelationGetForm(relation);
+ schemaname = get_namespace_name(class_form->relnamespace);
+ tablename = NameStr(class_form->relname);
+
+ if (change->action == REORDER_BUFFER_CHANGE_INSERT)
+ elog(DEBUG1, "INSERT \"%s\".\"%s\" txid: %u", schemaname, tablename, txn->xid);
+ else if (change->action == REORDER_BUFFER_CHANGE_UPDATE)
+ elog(DEBUG1, "UPDATE \"%s\".\"%s\" txid: %u", schemaname, tablename, txn->xid);
+ else if (change->action == REORDER_BUFFER_CHANGE_DELETE)
+ elog(DEBUG1, "DELETE \"%s\".\"%s\" txid: %u", schemaname, tablename, txn->xid);
+
/* ... then check row filter */
if (list_length(relentry->qual) > 0)
{
@@ -364,6 +380,42 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
tupdesc = RelationGetDescr(relation);
estate = create_estate_for_relation(relation);
+#ifdef _NOT_USED
+ if (old_tuple)
+ {
+ int i;
+
+ for (i = 0; i < tupdesc->natts; i++)
+ {
+ Form_pg_attribute attr;
+ HeapTuple type_tuple;
+ Oid typoutput;
+ bool typisvarlena;
+ bool isnull;
+ Datum val;
+ char *outputstr = NULL;
+
+ attr = TupleDescAttr(tupdesc, i);
+
+ /* Figure out type name */
+ type_tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(attr->atttypid));
+ if (HeapTupleIsValid(type_tuple))
+ {
+ /* Get information needed for printing values of a type */
+ getTypeOutputInfo(attr->atttypid, &typoutput, &typisvarlena);
+
+ val = heap_getattr(old_tuple, i + 1, tupdesc, &isnull);
+ if (!isnull)
+ {
+ outputstr = OidOutputFunctionCall(typoutput, val);
+ elog(DEBUG2, "row filter: REPLICA IDENTITY %s: %s", NameStr(attr->attname), outputstr);
+ pfree(outputstr);
+ }
+ }
+ }
+ }
+#endif
+
/* prepare context per tuple */
ecxt = GetPerTupleExprContext(estate);
oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
@@ -379,6 +431,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Oid expr_type;
Datum res;
bool isnull;
+ char *s = NULL;
qual = (Node *) lfirst(lc);
@@ -389,12 +442,22 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
expr_state = ExecInitExpr(expr, NULL);
res = ExecEvalExpr(expr_state, ecxt, &isnull);
+ elog(DEBUG3, "row filter: result: %s ; isnull: %s", (DatumGetBool(res)) ? "true" : "false", (isnull) ? "true" : "false");
+
/* if tuple does not match row filter, bail out */
if (!DatumGetBool(res) || isnull)
{
+ s = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, CStringGetTextDatum(nodeToString(qual)), ObjectIdGetDatum(relentry->relid)));
+ elog(DEBUG2, "row filter \"%s\" was not matched", s);
+ pfree(s);
+
matched = false;
break;
}
+
+ s = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, CStringGetTextDatum(nodeToString(qual)), ObjectIdGetDatum(relentry->relid)));
+ elog(DEBUG2, "row filter \"%s\" was matched", s);
+ pfree(s);
}
MemoryContextSwitchTo(oldcxt);
@@ -668,10 +731,13 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
MemoryContext oldctx = MemoryContextSwitchTo(CacheMemoryContext);
char *s = TextDatumGetCString(rf_datum);
+ char *t = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, rf_datum, ObjectIdGetDatum(entry->relid)));
Node *rf_node = stringToNode(s);
entry->qual = lappend(entry->qual, rf_node);
MemoryContextSwitchTo(oldctx);
+
+ elog(DEBUG2, "row filter \"%s\" found for publication \"%s\" and relation \"%s\"", t, pub->name, get_rel_name(relid));
}
ReleaseSysCache(rf_tuple);
--
2.7.4
0007-Publication-where-condition-support-for-pg_dump.patch (text/x-patch)
From 3897f998a328fbd42824fe265a15e76ec1247703 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Sat, 15 Sep 2018 02:52:00 +0000
Subject: [PATCH 7/8] Publication where condition support for pg_dump
---
src/bin/pg_dump/pg_dump.c | 15 +++++++++++++--
src/bin/pg_dump/pg_dump.h | 1 +
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f01fea5..3c37134 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3959,6 +3959,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_tableoid;
int i_oid;
int i_pubname;
+ int i_pubrelqual;
int i,
j,
ntups;
@@ -3991,7 +3992,8 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
/* Get the publication membership for the table. */
appendPQExpBuffer(query,
- "SELECT pr.tableoid, pr.oid, p.pubname "
+ "SELECT pr.tableoid, pr.oid, p.pubname, "
+ "pg_catalog.pg_get_expr(pr.prqual, pr.prrelid) AS pubrelqual "
"FROM pg_publication_rel pr, pg_publication p "
"WHERE pr.prrelid = '%u'"
" AND p.oid = pr.prpubid",
@@ -4012,6 +4014,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_pubname = PQfnumber(res, "pubname");
+ i_pubrelqual = PQfnumber(res, "pubrelqual");
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4027,6 +4030,11 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].pubname = pg_strdup(PQgetvalue(res, j, i_pubname));
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, j, i_pubrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, j, i_pubrelqual));
+
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
}
@@ -4055,8 +4063,11 @@ dumpPublicationTable(Archive *fout, PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubrinfo->pubname));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE %s", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index ec5a924..8d61faa 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -609,6 +609,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
TableInfo *pubtable;
char *pubname;
+ char *pubrelqual;
} PublicationRelInfo;
/*
--
2.7.4
Which regression?
Apply patches 0001 through 0005 and run 'make check'; one test fails.
After also applying 0006.patch, the failure disappears.
There should be an error because you don't have a PK or REPLICA IDENTITY.
No. I have done the 'ALTER TABLE cities REPLICA IDENTITY FULL'.
Even if you create a PK or REPLICA IDENTITY, it won't turn this UPDATE
into an INSERT and send it to the other node (the UPDATE will indeed be
sent, but there is no tuple to update on the subscriber). Also, filter
columns must be part of the PK or REPLICA IDENTITY. I explain this in
the documentation.
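To make this concrete, here is a minimal sketch of the rule; the filter expression and publication name below are assumptions, since the filter actually used in the failing test is not shown in this thread:

```sql
CREATE TABLE cities (name text, population int, altitude int);

-- Without a PK, UPDATE/DELETE cannot be replicated at all, and a filter
-- column that is not covered by the replica identity has the same
-- effect, so cover the filter column explicitly:
ALTER TABLE cities REPLICA IDENTITY FULL;

-- Hypothetical row filter on a non-key column:
CREATE PUBLICATION pub_cities FOR TABLE cities WHERE (altitude < 200);
```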
You should consider Result 2:
On the publisher:
insert into cities values('t1',1,135);
update cities set altitude=300 where altitude=135;
postgres=# table cities ;
name | population | altitude
------+------------+----------
t1 | 123 | 134
t1 | 1 | 300
(2 rows)
On the subscriber:
postgres=# table cities ;
name | population | altitude
------+------------+----------
t1 | 1 | 135
The tuple ('t1',1,135) appeared on both the publisher and the subscriber,
but after an update on the publisher that tuple is gone from the
publisher while nothing changed on the subscriber.
The same goes for Result 1; these cases puzzled me today, and I think
they will puzzle users in the future. There should be a better design,
or at least a log message to tell users that something went wrong during
replication.
---
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca
Hi Euler,
Thanks for working on this. I have reviewed the patches, as I too am
working on a patch related to logical replication [1]: https://commitfest.postgresql.org/25/2301/.
On Thu, Sep 26, 2019 at 8:20 AM Euler Taveira <euler@timbira.com.br> wrote:
On Wed, Sep 25, 2019, at 08:08, Euler Taveira
<euler@timbira.com.br> wrote: I'll send a patchset later today.
... and it is attached.
Needed to be rebased, which I did, to be able to test them; patches attached.
Some comments:
* 0001: seems a no-brainer
* 0002: seems, um, unnecessary? The only place ntuples will be used is here:
@@ -702,9 +702,8 @@ fetch_remote_table_info(char *nspname, char *relname,
(errmsg("could not fetch table info for table \"%s.%s\": %s",
nspname, relname, res->err)));
- /* We don't know the number of rows coming, so allocate enough space. */
- lrel->attnames = palloc0(MaxTupleAttributeNumber * sizeof(char *));
- lrel->atttyps = palloc0(MaxTupleAttributeNumber * sizeof(Oid));
+ lrel->attnames = palloc0(res->ntuples * sizeof(char *));
+ lrel->atttyps = palloc0(res->ntuples * sizeof(Oid));
but you might as well use tuplestore_tuple_count(res->tuplestore). My
point is that if the ntuples field this patch adds were widely useful
(as would be shown by the number of places that could be refactored to
use it), it would have been worthwhile to add it.
* 0003: seems fine to me.
* 0004: seems fine too, although maybe preproc.y should also be updated?
* 0005: naturally many comments here :)
+ <entry>Expression tree (in the form of a
+ <function>nodeToString()</function> representation) for the relation's
Minor nitpicking: "in the form of a" seems unnecessary. Other places
that mention nodeToString() just say "in
<function>nodeToString()</function> representation"
+ Columns used in the <literal>WHERE</literal> clause must be part of the
+ primary key or be covered by <literal>REPLICA IDENTITY</literal> otherwise
+ <command>UPDATE</command> and <command>DELETE</command> operations will not
+ be replicated.
+ </para>
Can you please explain the reasoning behind this restriction? Sorry
if this is already covered in the up-thread discussion.
/*
+ * Gets list of PublicationRelationQuals for a publication.
+ */
+List *
+GetPublicationRelationQuals(Oid pubid)
+{
...
+ relqual->relation = table_open(pubrel->prrelid,
ShareUpdateExclusiveLock);
I think it's a bad idea to open the table in one file and rely on
something else in the other file closing it. I know you're having to
do it because you're using PublicationRelationQual to return
individual tables, but why not just store the table's OID in it and
only open and close the relation where it's needed. Keeping the
opening and closing of the relation close to each other is better as long
as it doesn't need to be done many times over in many different
functions. In this case, pg_publication.c: publication_add_relation()
is the only place that needs to look at the open relation, so opening
and closing should both be done there. Nothing else needs to look at
the open relation.
Actually, OpenTableList() should also not open the relation. Then we
don't need CloseTableList(). I think it would be better to refactor
things around this and include the patch in this series.
+ /* Find all publications associated with the relation. */
+ pubrelsrel = table_open(PublicationRelRelationId, AccessShareLock);
I guess you meant:
/* Get all relations associated with this publication. */
+ relqual->whereClause = copyObject(qual_expr);
Is copying really necessary?
+ /*
+ * ALTER PUBLICATION ... DROP TABLE cannot contain a WHERE clause. Use
+ * publication_table_list node (that accepts a WHERE clause) but forbid
+ * the WHERE clause in it. The use of relation_expr_list node just for
+ * the DROP TABLE part does not worth the trouble.
+ */
This comment is not very helpful, as it's not clear what the various
names refer to. I'd just write:
/*
* Although ALTER PUBLICATION's grammar allows a WHERE clause to be
* specified for the DROP TABLE action, it doesn't make sense to allow it.
* We implement that rule here, instead of complicating the grammar to
* enforce it.
*/
+ errmsg("cannot use a WHERE clause for
removing table from publication \"%s\"",
I think: s/for/when/g
+ /*
+ * Remove publication / relation mapping iif (i) table is not
+ * found in the new list or (ii) table is found in the new list,
+ * however, its qual does not match the old one (in this case, a
+ * simple tuple update is not enough because of the dependencies).
+ */
Aside from the typo on the 1st line (iif), I suggest writing this as:
/*-----------
* Remove the publication-table mapping if:
*
* 1) Table is not found in the new list of tables
*
* 2) Table is being re-added with a different qual expression
*
* For (2), simply updating the existing tuple is not enough,
* because of the qual expression's dependencies.
*/
+ errmsg("functions are not allowed in WHERE"),
Maybe:
functions are not allowed in publication WHERE expressions
+ err = _("cannot use subquery in publication WHERE expression");
s/expression/expressions/g
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
Maybe:
publication WHERE expression
or
publication qual
- int natt;
+ int n;
Are this and the other related changes really needed?
+ appendStringInfoString(&cmd, "COPY (SELECT ");
+ /* list of attribute names */
+ first = true;
+ foreach(lc, attnamelist)
+ {
+ char *col = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+ appendStringInfo(&cmd, "%s", quote_identifier(col));
+ }
Hmm, why wouldn't SELECT * suffice?
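For reference, a sketch of the two forms the sync worker would issue, with the table and filter taken from the TAP test earlier in the thread; the exact generated text is an assumption:

```sql
-- Without a row filter, initial sync issues a plain table COPY:
COPY public.tab_rowfilter_1 TO STDOUT;

-- With a row filter, the patch builds a filtered SELECT with an
-- explicit column list (the subject of the "SELECT *" question above):
COPY (SELECT a, b FROM public.tab_rowfilter_1
      WHERE (a > 1000 AND b <> 'filtered')) TO STDOUT;
```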
+ estate = create_estate_for_relation(relation);
+
+ /* prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate,
tupdesc, &TTSOpsHeapTuple);
...
+ ExecDropSingleTupleTableSlot(ecxt->ecxt_scantuple);
+ FreeExecutorState(estate);
Creating and destroying the EState (that too with the ResultRelInfo
that is never used) for every tuple seems wasteful. You could store
the standalone ExprContext in RelationSyncEntry and use it for every
tuple.
+ /* evaluates row filter */
+ expr_type = exprType(qual);
+ expr = (Expr *) coerce_to_target_type(NULL, qual,
expr_type, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST,
-1);
+ expr = expression_planner(expr);
+ expr_state = ExecInitExpr(expr, NULL);
Also, there appears to be no need to repeat this for every tuple? I
think this should be done only once, that is, RelationSyncEntry.qual
should cache ExprState nodes, not bare Expr nodes.
Given the above comments, the following seems unnecessary:
+extern EState *create_estate_for_relation(Relation rel);
By the way, make check doesn't pass. I see the following failure:
- "public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
+ "public.testpub_rf_tbl3"
but I guess applying subsequent patches takes care of that.
* 0006 and 0007: small enough that I think it might be better to merge
them into 0005.
* 0008: no comments as it's not intended to be committed. :)
Thanks,
Amit
On Mon, Nov 25, 2019 at 11:38 AM Amit Langote <amitlangote09@gmail.com> wrote:
Needed to be rebased, which I did, to be able to test them; patches attached.
Oops, really attached this time.
Thanks,
Amit
Attachments:
0002-Store-number-of-tuples-in-WalRcvExecResult.patch (application/octet-stream)
From e6c50bfe35f96d29c8bc2ec3b57c92dd5376a46a Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Fri, 9 Mar 2018 17:37:36 +0000
Subject: [PATCH 2/8] Store number of tuples in WalRcvExecResult
It seems to be a useful information while allocating memory for queries
that returns more than one row. It reduces memory allocation
for initial table synchronization.
---
src/backend/replication/libpqwalreceiver/libpqwalreceiver.c | 5 +++--
src/backend/replication/logical/tablesync.c | 5 ++---
src/include/replication/walreceiver.h | 1 +
3 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 545d2fcd05..1a32dbd2e6 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -878,6 +878,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
errdetail("Expected %d fields, got %d fields.",
nRetTypes, nfields)));
+ walres->ntuples = PQntuples(pgres);
walres->tuplestore = tuplestore_begin_heap(true, false, work_mem);
/* Create tuple descriptor corresponding to expected result. */
@@ -888,7 +889,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
/* No point in doing more here if there were no tuples returned. */
- if (PQntuples(pgres) == 0)
+ if (walres->ntuples == 0)
return;
/* Create temporary context for local allocations. */
@@ -897,7 +898,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
ALLOCSET_DEFAULT_SIZES);
/* Process returned rows. */
- for (tupn = 0; tupn < PQntuples(pgres); tupn++)
+ for (tupn = 0; tupn < walres->ntuples; tupn++)
{
char *cstrs[MaxTupleAttributeNumber];
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index b437e093b1..205f2a4979 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -702,9 +702,8 @@ fetch_remote_table_info(char *nspname, char *relname,
(errmsg("could not fetch table info for table \"%s.%s\": %s",
nspname, relname, res->err)));
- /* We don't know the number of rows coming, so allocate enough space. */
- lrel->attnames = palloc0(MaxTupleAttributeNumber * sizeof(char *));
- lrel->atttyps = palloc0(MaxTupleAttributeNumber * sizeof(Oid));
+ lrel->attnames = palloc0(res->ntuples * sizeof(char *));
+ lrel->atttyps = palloc0(res->ntuples * sizeof(Oid));
lrel->attkeys = NULL;
natt = 0;
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index e12a934966..0d32d598d8 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -196,6 +196,7 @@ typedef struct WalRcvExecResult
char *err;
Tuplestorestate *tuplestore;
TupleDesc tupledesc;
+ int ntuples;
} WalRcvExecResult;
/* libpqwalreceiver hooks */
--
2.11.0
0003-Refactor-function-create_estate_for_relation.patch (application/octet-stream)
From 8b926ed6f65eeda9d424a88987f984940fea7687 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Tue, 27 Feb 2018 02:21:03 +0000
Subject: [PATCH 3/8] Refactor function create_estate_for_relation
Relation localrel is the only LogicalRepRelMapEntry structure member
that is useful for create_estate_for_relation.
---
src/backend/replication/logical/worker.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ced0d191c2..e5e87f3c2f 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -173,7 +173,7 @@ ensure_transaction(void)
* This is based on similar code in copy.c
*/
static EState *
-create_estate_for_relation(LogicalRepRelMapEntry *rel)
+create_estate_for_relation(Relation rel)
{
EState *estate;
ResultRelInfo *resultRelInfo;
@@ -183,13 +183,13 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel)
rte = makeNode(RangeTblEntry);
rte->rtekind = RTE_RELATION;
- rte->relid = RelationGetRelid(rel->localrel);
- rte->relkind = rel->localrel->rd_rel->relkind;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
rte->rellockmode = AccessShareLock;
ExecInitRangeTable(estate, list_make1(rte));
resultRelInfo = makeNode(ResultRelInfo);
- InitResultRelInfo(resultRelInfo, rel->localrel, 1, NULL, 0);
+ InitResultRelInfo(resultRelInfo, rel, 1, NULL, 0);
estate->es_result_relations = resultRelInfo;
estate->es_num_result_relations = 1;
@@ -605,7 +605,7 @@ apply_handle_insert(StringInfo s)
}
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -712,7 +712,7 @@ apply_handle_update(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -831,7 +831,7 @@ apply_handle_delete(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
--
2.11.0
0001-Remove-unused-atttypmod-column-from-initial-table-sy.patch (application/octet-stream)
From caf71cf951754bc711498b2dccfbd8289049b591 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Fri, 9 Mar 2018 18:39:22 +0000
Subject: [PATCH 1/8] Remove unused atttypmod column from initial table
synchronization
Since commit 7c4f52409a8c7d85ed169bbbc1f6092274d03920, atttypmod has been
fetched but never used. Removing it is safe because COPY from the
publisher does not need that information.
---
src/backend/replication/logical/tablesync.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index e01d18c3a1..b437e093b1 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -640,7 +640,7 @@ fetch_remote_table_info(char *nspname, char *relname,
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[2] = {OIDOID, CHAROID};
- Oid attrRow[4] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
+ Oid attrRow[3] = {TEXTOID, OIDOID, BOOLOID};
bool isnull;
int natt;
@@ -684,7 +684,6 @@ fetch_remote_table_info(char *nspname, char *relname,
appendStringInfo(&cmd,
"SELECT a.attname,"
" a.atttypid,"
- " a.atttypmod,"
" a.attnum = ANY(i.indkey)"
" FROM pg_catalog.pg_attribute a"
" LEFT JOIN pg_catalog.pg_index i"
@@ -696,7 +695,7 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->remoteid,
(walrcv_server_version(wrconn) >= 120000 ? "AND a.attgenerated = ''" : ""),
lrel->remoteid);
- res = walrcv_exec(wrconn, cmd.data, 4, attrRow);
+ res = walrcv_exec(wrconn, cmd.data, 3, attrRow);
if (res->status != WALRCV_OK_TUPLES)
ereport(ERROR,
@@ -717,7 +716,7 @@ fetch_remote_table_info(char *nspname, char *relname,
Assert(!isnull);
lrel->atttyps[natt] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
Assert(!isnull);
- if (DatumGetBool(slot_getattr(slot, 4, &isnull)))
+ if (DatumGetBool(slot_getattr(slot, 3, &isnull)))
lrel->attkeys = bms_add_member(lrel->attkeys, natt);
/* Should never happen. */
--
2.11.0
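For manual verification, the attribute query that patch 0001 trims can be run directly on a publisher. The join and WHERE conditions below are reconstructed from the surrounding tablesync.c code (they fall outside the hunk shown), and 'mytable' is a placeholder table name:

```sql
-- Per-column information fetched during initial table sync; atttypmod is
-- gone since COPY does not need it. Assumes a v12+ publisher (for
-- pg_get_replica_identity_index).
SELECT a.attname,
       a.atttypid,
       a.attnum = ANY(i.indkey) AS is_replica_identity_col
  FROM pg_catalog.pg_attribute a
  LEFT JOIN pg_catalog.pg_index i
    ON (i.indexrelid = pg_get_replica_identity_index('mytable'::regclass))
 WHERE a.attnum > 0::pg_catalog.int2
   AND NOT a.attisdropped
   AND a.attrelid = 'mytable'::regclass
 ORDER BY a.attnum;
```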
0004-Rename-a-WHERE-node.patch (application/octet-stream)
From 1c5fc52027169f045a49481f01129c7a3f0d5822 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Wed, 24 Jan 2018 17:01:31 -0200
Subject: [PATCH 4/8] Rename a WHERE node
A WHERE clause will be used for row filtering in logical replication. We
already have a similar production, ExclusionWhereClause ('WHERE (condition)').
Rename it to the generic OptWhereClause so it can be reused for row filtering.
---
src/backend/parser/gram.y | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c5086846de..faf9b4bc80 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -478,7 +478,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <node> def_arg columnElem where_clause where_or_current_clause
a_expr b_expr c_expr AexprConst indirection_el opt_slice_bound
columnref in_expr having_clause func_table xmltable array_expr
- ExclusionWhereClause operator_def_arg
+ OptWhereClause operator_def_arg
%type <list> rowsfrom_item rowsfrom_list opt_col_def_list
%type <boolean> opt_ordinality
%type <list> ExclusionConstraintList ExclusionConstraintElem
@@ -3713,7 +3713,7 @@ ConstraintElem:
$$ = (Node *)n;
}
| EXCLUDE access_method_clause '(' ExclusionConstraintList ')'
- opt_c_include opt_definition OptConsTableSpace ExclusionWhereClause
+ opt_c_include opt_definition OptConsTableSpace OptWhereClause
ConstraintAttributeSpec
{
Constraint *n = makeNode(Constraint);
@@ -3815,7 +3815,7 @@ ExclusionConstraintElem: index_elem WITH any_operator
}
;
-ExclusionWhereClause:
+OptWhereClause:
WHERE '(' a_expr ')' { $$ = $3; }
| /*EMPTY*/ { $$ = NULL; }
;
--
2.11.0
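Before the patch itself, a short usage sketch of the syntax it adds; the publication, table, and predicate names here are made up for illustration:

```sql
-- Publish only a subset of rows; parentheses around the predicate are required.
CREATE PUBLICATION pub_recent_orders
    FOR TABLE orders WHERE (created_at >= '2018-01-01');

-- The filter can be replaced later with SET TABLE ... WHERE.
ALTER PUBLICATION pub_recent_orders
    SET TABLE orders WHERE (created_at >= '2019-01-01');

-- DROP TABLE takes no WHERE clause; the patch rejects one explicitly.
ALTER PUBLICATION pub_recent_orders DROP TABLE orders;
```

Note that for UPDATE and DELETE to be replicated, the filter columns must be part of the primary key or covered by REPLICA IDENTITY, per the documentation changes in this patch.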
0005-Row-filtering-for-logical-replication.patch (application/octet-stream)
From fc7b6112b3d3f2a42837565e451c4e4807674297 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Tue, 27 Feb 2018 04:03:13 +0000
Subject: [PATCH 5/8] Row filtering for logical replication
When you define or modify a publication, you can optionally filter the
rows to be published using a WHERE condition. The condition can be any
expression that evaluates to boolean. Only rows that satisfy the WHERE
condition will be sent to subscribers.
---
doc/src/sgml/catalogs.sgml | 9 ++
doc/src/sgml/ref/alter_publication.sgml | 11 ++-
doc/src/sgml/ref/create_publication.sgml | 26 +++++-
src/backend/catalog/pg_publication.c | 103 ++++++++++++++++++++--
src/backend/commands/publicationcmds.c | 89 +++++++++++++------
src/backend/parser/gram.y | 26 ++++--
src/backend/parser/parse_agg.c | 10 +++
src/backend/parser/parse_expr.c | 14 ++-
src/backend/parser/parse_func.c | 3 +
src/backend/replication/logical/tablesync.c | 127 +++++++++++++++++++++++++---
src/backend/replication/logical/worker.c | 2 +-
src/backend/replication/pgoutput/pgoutput.c | 100 +++++++++++++++++++++-
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 4 +
src/include/catalog/toasting.h | 1 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 ++-
src/include/parser/parse_node.h | 1 +
src/include/replication/logicalrelation.h | 2 +
src/test/regress/expected/publication.out | 29 +++++++
src/test/regress/sql/publication.sql | 21 +++++
src/test/subscription/t/013_row_filter.pl | 96 +++++++++++++++++++++
22 files changed, 634 insertions(+), 61 deletions(-)
create mode 100644 src/test/subscription/t/013_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 55694c4368..a16571fee7 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -5595,6 +5595,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<entry><literal><link linkend="catalog-pg-class"><structname>pg_class</structname></link>.oid</literal></entry>
<entry>Reference to relation</entry>
</row>
+
+ <row>
+ <entry><structfield>prqual</structfield></entry>
+ <entry><type>pg_node_tree</type></entry>
+ <entry></entry>
+ <entry>Expression tree (in the form of a
+ <function>nodeToString()</function> representation) for the relation's
+ qualifying condition</entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index 534e598d93..9608448207 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_USER | SESSION_USER }
@@ -91,7 +91,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is executed with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 99f87ca393..6e99943374 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
@@ -68,7 +68,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
that table is added to the publication. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are added.
Optionally, <literal>*</literal> can be specified after the table name to
- explicitly indicate that descendant tables are included.
+ explicitly indicate that descendant tables are included. If the optional
+ <literal>WHERE</literal> clause is specified, rows that do not satisfy
+ the <replaceable class="parameter">expression</replaceable> will not be
+ published. Note that parentheses are required around the expression.
</para>
<para>
@@ -157,6 +160,13 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ Columns used in the <literal>WHERE</literal> clause must be part of the
+ primary key or be covered by <literal>REPLICA IDENTITY</literal>; otherwise,
+ <command>UPDATE</command> and <command>DELETE</command> operations will not
+ be replicated.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -171,6 +181,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -184,6 +199,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index d442c8e0bb..57f41f7e92 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -31,6 +31,11 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
+
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -146,18 +151,21 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
+ Oid relid = RelationGetRelid(targetrel->relation);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ RangeTblEntry *rte;
+ Node *whereclause;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -177,10 +185,27 @@ publication_add_relation(Oid pubid, Relation targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel), pub->name)));
+ RelationGetRelationName(targetrel->relation), pub->name)));
}
- check_publication_add_relation(targetrel);
+ check_publication_add_relation(targetrel->relation);
+
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(targetrel->whereClause);
+
+ rte = addRangeTableEntryForRelation(pstate, targetrel->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addRTEtoQuery(pstate, rte, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(targetrel->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -194,6 +219,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -210,11 +241,17 @@ publication_add_relation(Oid pubid, Relation targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel);
+ CacheInvalidateRelcache(targetrel->relation);
return myself;
}
@@ -289,6 +326,62 @@ GetPublicationRelations(Oid pubid)
}
/*
+ * Gets list of PublicationRelationQuals for a publication.
+ */
+List *
+GetPublicationRelationQuals(Oid pubid)
+{
+ List *result;
+ Relation pubrelsrel;
+ ScanKeyData scankey;
+ SysScanDesc scan;
+ HeapTuple tup;
+
+ /* Find all publications associated with the relation. */
+ pubrelsrel = table_open(PublicationRelRelationId, AccessShareLock);
+
+ ScanKeyInit(&scankey,
+ Anum_pg_publication_rel_prpubid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(pubid));
+
+ scan = systable_beginscan(pubrelsrel, PublicationRelPrrelidPrpubidIndexId,
+ true, NULL, 1, &scankey);
+
+ result = NIL;
+ while (HeapTupleIsValid(tup = systable_getnext(scan)))
+ {
+ Form_pg_publication_rel pubrel;
+ PublicationRelationQual *relqual;
+ Datum value_datum;
+ char *qual_value;
+ Node *qual_expr;
+ bool isnull;
+
+ pubrel = (Form_pg_publication_rel) GETSTRUCT(tup);
+
+ value_datum = heap_getattr(tup, Anum_pg_publication_rel_prqual, RelationGetDescr(pubrelsrel), &isnull);
+ if (!isnull)
+ {
+ qual_value = TextDatumGetCString(value_datum);
+ qual_expr = (Node *) stringToNode(qual_value);
+ }
+ else
+ qual_expr = NULL;
+
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = table_open(pubrel->prrelid, ShareUpdateExclusiveLock);
+ relqual->whereClause = copyObject(qual_expr);
+ result = lappend(result, relqual);
+ }
+
+ systable_endscan(scan);
+ table_close(pubrelsrel, AccessShareLock);
+
+ return result;
+}
+
+/*
* Gets list of publication oids for publications marked as FOR ALL TABLES.
*/
List *
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index fbf11c86aa..f6a91a33f6 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -348,6 +348,28 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ /*
+ * ALTER PUBLICATION ... DROP TABLE cannot contain a WHERE clause. We
+ * still use the publication_table_list node (which accepts a WHERE
+ * clause) but forbid the clause here; a separate relation_expr_list
+ * production just for the DROP TABLE case is not worth the trouble.
+ */
+ if (stmt->tableAction == DEFELEM_DROP)
+ {
+ ListCell *lc;
+
+ foreach(lc, stmt->tables)
+ {
+ PublicationTable *t = lfirst(lc);
+
+ if (t->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot use a WHERE clause for removing table from publication \"%s\"",
+ NameStr(pubform->pubname))));
+ }
+ }
+
rels = OpenTableList(stmt->tables);
if (stmt->tableAction == DEFELEM_ADD)
@@ -356,47 +378,56 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
PublicationDropTables(pubid, rels, false);
else /* DEFELEM_SET */
{
- List *oldrelids = GetPublicationRelations(pubid);
+ List *oldrels = GetPublicationRelationQuals(pubid);
List *delrels = NIL;
ListCell *oldlc;
/* Calculate which relations to drop. */
- foreach(oldlc, oldrelids)
+ foreach(oldlc, oldrels)
{
- Oid oldrelid = lfirst_oid(oldlc);
+ PublicationRelationQual *oldrel = lfirst(oldlc);
+ PublicationRelationQual *newrel;
ListCell *newlc;
bool found = false;
foreach(newlc, rels)
{
- Relation newrel = (Relation) lfirst(newlc);
+ newrel = (PublicationRelationQual *) lfirst(newlc);
- if (RelationGetRelid(newrel) == oldrelid)
+ if (RelationGetRelid(newrel->relation) == RelationGetRelid(oldrel->relation))
{
found = true;
break;
}
}
- if (!found)
+ /*
+ * Remove the publication / relation mapping iff (i) the table is not
+ * found in the new list, or (ii) the table is found in the new list
+ * but its qual does not match the old one (in this case, a
+ * simple tuple update is not enough because of the dependencies).
+ */
+ if (!found || !equal(oldrel->whereClause, newrel->whereClause))
{
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
+ PublicationRelationQual *oldrelqual = palloc(sizeof(PublicationRelationQual));
- delrels = lappend(delrels, oldrel);
+ oldrelqual->relation = table_open(RelationGetRelid(oldrel->relation),
+ ShareUpdateExclusiveLock);
+
+ delrels = lappend(delrels, oldrelqual);
}
}
/* And drop them. */
PublicationDropTables(pubid, delrels, true);
+ CloseTableList(oldrels);
+ CloseTableList(delrels);
/*
* Don't bother calculating the difference for adding, we'll catch and
* skip existing ones when doing catalog update.
*/
PublicationAddTables(pubid, rels, true, stmt);
-
- CloseTableList(delrels);
}
CloseTableList(rels);
@@ -506,13 +537,15 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationQual *relqual;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
+ PublicationTable *t = lfirst(lc);
+ RangeVar *rv = castNode(RangeVar, t->relation);
bool recurse = rv->inh;
Relation rel;
Oid myrelid;
@@ -535,8 +568,10 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = rel;
+ relqual->whereClause = t->whereClause;
+ rels = lappend(rels, relqual);
relids = lappend_oid(relids, myrelid);
/* Add children of this rel, if requested */
@@ -564,7 +599,11 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = rel;
+ /* child inherits WHERE clause from parent */
+ relqual->whereClause = t->whereClause;
+ rels = lappend(rels, relqual);
relids = lappend_oid(relids, childrelid);
}
}
@@ -585,10 +624,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -604,13 +645,13 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(RelationGetRelid(rel->relation), GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->relation->rd_rel->relkind),
+ RelationGetRelationName(rel->relation));
obj = publication_add_relation(pubid, rel, if_not_exists);
if (stmt)
@@ -636,8 +677,8 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
+ Oid relid = RelationGetRelid(rel->relation);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
ObjectIdGetDatum(relid),
@@ -650,7 +691,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(rel->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index faf9b4bc80..ba19e9e13f 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -405,14 +405,14 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
relation_expr_list dostmt_opt_list
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
- publication_name_list
+ publication_name_list publication_table_list
vacuum_relation_list opt_vacuum_relation_list
drop_option_list
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <value> publication_name_item
%type <list> opt_fdw_options fdw_options
@@ -9571,7 +9571,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9602,7 +9602,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9610,7 +9610,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9618,7 +9618,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE relation_expr_list
+ | ALTER PUBLICATION name DROP TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9628,6 +9628,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index f418c61545..dea5aadca7 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -544,6 +544,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -933,6 +940,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_GENERATED_COLUMN:
err = _("window functions are not allowed in column generation expressions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index eb91da2d87..bab9f44c61 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -169,6 +169,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -570,6 +577,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_CALL_ARGUMENT:
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1923,13 +1931,15 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
break;
case EXPR_KIND_CALL_ARGUMENT:
err = _("cannot use subquery in CALL argument");
break;
case EXPR_KIND_COPY_WHERE:
err = _("cannot use subquery in COPY FROM WHERE condition");
break;
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3560,6 +3570,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "WHERE";
case EXPR_KIND_GENERATED_COLUMN:
return "GENERATED AS";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index d9c6dc1901..bd47244a73 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2517,6 +2517,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_GENERATED_COLUMN:
err = _("set-returning functions are not allowed in column generation expressions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 205f2a4979..5d58d1cd66 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -630,19 +630,26 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[2] = {OIDOID, CHAROID};
Oid attrRow[3] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[1] = {TEXTOID};
bool isnull;
- int natt;
+ int n;
+ ListCell *lc;
+ bool first;
+
+ /* Avoid trashing relation map cache */
+ memset(lrel, 0, sizeof(LogicalRepRelation));
lrel->nspname = nspname;
lrel->relname = relname;
@@ -706,20 +713,20 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->atttyps = palloc0(res->ntuples * sizeof(Oid));
lrel->attkeys = NULL;
- natt = 0;
+ n = 0;
slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
{
- lrel->attnames[natt] =
+ lrel->attnames[n] =
TextDatumGetCString(slot_getattr(slot, 1, &isnull));
Assert(!isnull);
- lrel->atttyps[natt] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
+ lrel->atttyps[n] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
Assert(!isnull);
if (DatumGetBool(slot_getattr(slot, 3, &isnull)))
- lrel->attkeys = bms_add_member(lrel->attkeys, natt);
+ lrel->attkeys = bms_add_member(lrel->attkeys, n);
/* Should never happen. */
- if (++natt >= MaxTupleAttributeNumber)
+ if (++n >= MaxTupleAttributeNumber)
elog(ERROR, "too many columns in remote table \"%s.%s\"",
nspname, relname);
@@ -727,7 +734,52 @@ fetch_remote_table_info(char *nspname, char *relname,
}
ExecDropSingleTupleTableSlot(slot);
- lrel->natts = natt;
+ lrel->natts = n;
+
+ walrcv_clear_result(res);
+
+ /* Get relation qual */
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(wrconn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
walrcv_clear_result(res);
pfree(cmd.data);
@@ -743,6 +795,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyState cstate;
@@ -751,7 +804,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -760,10 +813,59 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* list of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
initStringInfo(&cmd);
- appendStringInfo(&cmd, "COPY %s TO STDOUT",
- quote_qualified_identifier(lrel.nspname, lrel.relname));
+
+ /*
+ * If publication has any row filter, build a SELECT query with OR'ed row
+ * filters for COPY. If no row filters are available, use COPY for all
+ * table contents.
+ */
+ if (list_length(qual) > 0)
+ {
+ ListCell *lc;
+ bool first;
+
+ appendStringInfoString(&cmd, "COPY (SELECT ");
+ /* list of attribute names */
+ first = true;
+ foreach(lc, attnamelist)
+ {
+ char *col = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+ appendStringInfo(&cmd, "%s", quote_identifier(col));
+ }
+ appendStringInfo(&cmd, " FROM ONLY %s",
+ quote_qualified_identifier(lrel.nspname, lrel.relname));
+ appendStringInfoString(&cmd, " WHERE ");
+ /* list of OR'ed filters */
+ first = true;
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
+ list_free_deep(qual);
+ }
+ else
+ {
+ appendStringInfo(&cmd, "COPY %s TO STDOUT",
+ quote_qualified_identifier(lrel.nspname, lrel.relname));
+ }
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
@@ -778,7 +880,6 @@ copy_table(Relation rel)
addRangeTableEntryForRelation(pstate, rel, AccessShareLock,
NULL, false, false);
- attnamelist = make_copy_attnamelist(relmapentry);
cstate = BeginCopyFrom(pstate, rel, NULL, false, copy_read_data, attnamelist, NIL);
/* Do the copy */
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index e5e87f3c2f..68c7e1a679 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -172,7 +172,7 @@ ensure_transaction(void)
*
* This is based on similar code in copy.c
*/
-static EState *
+EState *
create_estate_for_relation(Relation rel)
{
EState *estate;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 3483c1b877..9500c5e52a 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,12 +12,22 @@
*/
#include "postgres.h"
+#include "catalog/pg_type.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/execnodes.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/planner.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/memutils.h"
@@ -57,6 +67,7 @@ typedef struct RelationSyncEntry
bool schema_sent; /* did we send the schema? */
bool replicate_valid;
PublicationActions pubactions;
+ List *qual;
} RelationSyncEntry;
/* Map used to remember which relation schemas we sent. */
@@ -332,6 +343,65 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Assert(false);
}
+ /* ... then check row filter */
+ if (list_length(relentry->qual) > 0)
+ {
+ HeapTuple old_tuple;
+ HeapTuple new_tuple;
+ TupleDesc tupdesc;
+ EState *estate;
+ ExprContext *ecxt;
+ MemoryContext oldcxt;
+ ListCell *lc;
+ bool matched = true;
+
+ old_tuple = change->data.tp.oldtuple ? &change->data.tp.oldtuple->tuple : NULL;
+ new_tuple = change->data.tp.newtuple ? &change->data.tp.newtuple->tuple : NULL;
+ tupdesc = RelationGetDescr(relation);
+ estate = create_estate_for_relation(relation);
+
+ /* prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
+
+ ExecStoreHeapTuple(new_tuple ? new_tuple : old_tuple, ecxt->ecxt_scantuple, false);
+
+ foreach(lc, relentry->qual)
+ {
+ Node *qual;
+ ExprState *expr_state;
+ Expr *expr;
+ Oid expr_type;
+ Datum res;
+ bool isnull;
+
+ qual = (Node *) lfirst(lc);
+
+ /* evaluates row filter */
+ expr_type = exprType(qual);
+ expr = (Expr *) coerce_to_target_type(NULL, qual, expr_type, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+ expr = expression_planner(expr);
+ expr_state = ExecInitExpr(expr, NULL);
+ res = ExecEvalExpr(expr_state, ecxt, &isnull);
+
+ /* if tuple does not match row filter, bail out */
+ if (!DatumGetBool(res) || isnull)
+ {
+ matched = false;
+ break;
+ }
+ }
+
+ MemoryContextSwitchTo(oldcxt);
+
+ ExecDropSingleTupleTableSlot(ecxt->ecxt_scantuple);
+ FreeExecutorState(estate);
+
+ if (!matched)
+ return;
+ }
+
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
@@ -567,10 +637,14 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
*/
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->qual = NIL;
foreach(lc, data->publications)
{
Publication *pub = lfirst(lc);
+ HeapTuple rf_tuple;
+ Datum rf_datum;
+ bool rf_isnull;
if (pub->alltables || list_member_oid(pubids, pub->oid))
{
@@ -580,9 +654,24 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /* Cache row filters, if available */
+ rf_tuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rf_tuple))
+ {
+ rf_datum = SysCacheGetAttr(PUBLICATIONRELMAP, rf_tuple, Anum_pg_publication_rel_prqual, &rf_isnull);
+
+ if (!rf_isnull)
+ {
+ MemoryContext oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ char *s = TextDatumGetCString(rf_datum);
+ Node *rf_node = stringToNode(s);
+
+ entry->qual = lappend(entry->qual, rf_node);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rf_tuple);
+ }
}
list_free(pubids);
@@ -657,5 +746,10 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
*/
hash_seq_init(&status, RelationSyncCache);
while ((entry = (RelationSyncEntry *) hash_seq_search(&status)) != NULL)
+ {
entry->replicate_valid = false;
+ if (list_length(entry->qual) > 0)
+ list_free_deep(entry->qual);
+ entry->qual = NIL;
+ }
}
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 20a2f0ac1b..5261666a8b 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -78,15 +78,22 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationQual
+{
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationQual;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
extern List *GetPublicationRelations(Oid pubid);
+extern List *GetPublicationRelationQuals(Oid pubid);
extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(void);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index 5f5bc92ab3..7fd5915200 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid; /* Oid of the publication */
Oid prrelid; /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index cc5dfed0bf..d57ca82e89 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -66,6 +66,7 @@ DECLARE_TOAST(pg_namespace, 4163, 4164);
DECLARE_TOAST(pg_partitioned_table, 4165, 4166);
DECLARE_TOAST(pg_policy, 4167, 4168);
DECLARE_TOAST(pg_proc, 2836, 2837);
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
DECLARE_TOAST(pg_rewrite, 2838, 2839);
DECLARE_TOAST(pg_seclabel, 3598, 3599);
DECLARE_TOAST(pg_statistic, 2840, 2841);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index bce2d59b0d..52522d0fbd 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -477,6 +477,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index ff626cbe61..fcdecb1a24 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3475,12 +3475,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3493,7 +3500,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 7c099e7084..a5c9109acf 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -73,6 +73,7 @@ typedef enum ParseExprKind
EXPR_KIND_CALL_ARGUMENT, /* procedure argument in CALL */
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 2642a3f94e..5cc307ee0e 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -39,4 +39,6 @@ extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
extern void logicalrep_typmap_update(LogicalRepTyp *remotetyp);
extern char *logicalrep_typmap_gettypname(Oid remoteid);
+extern EState *create_estate_for_relation(Relation rel);
+
#endif /* LOGICALRELATION_H */
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index feb51e4add..202173c376 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -116,6 +116,35 @@ Tables:
DROP TABLE testpub_tbl3, testpub_tbl3a;
DROP PUBLICATION testpub3, testpub4;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in WHERE
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f | t | t | t | t
+Tables:
+ "public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: "testpub_view" is not a table
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 5773a755cf..6f0d088984 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -69,6 +69,27 @@ RESET client_min_messages;
DROP TABLE testpub_tbl3, testpub_tbl3a;
DROP PUBLICATION testpub3, testpub4;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+\dRp+ testpub5
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/013_row_filter.pl b/src/test/subscription/t/013_row_filter.pl
new file mode 100644
index 0000000000..99e6db94d6
--- /dev/null
+++ b/src/test/subscription/t/013_row_filter.pl
@@ -0,0 +1,96 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 4;
+
+# create publisher node
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')");
+
+my $result = $node_publisher->psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 DROP TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')");
+is($result, 3, "syntax error for ALTER PUBLICATION DROP TABLE");
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)");
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3");
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)");
+
+# test row filtering
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 10)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table sync to finish
+my $synced_query =
+"SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+#$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT a, b FROM tab_rowfilter_1");
+is($result, qq(1980|not filtered
+1001|test 1001
+1002|test 1002), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(7|2|10), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check filtered data was copied to subscriber');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
2.11.0
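For readers skimming the diff, the query-building logic in `copy_table()` above can be sketched in Python pseudocode. This is only an illustrative sketch of the assembled command, not the C implementation: the function name and the simplified identifier quoting are mine, and the real code uses `quote_identifier`/`quote_qualified_identifier`. The key point is that filters collected from all subscribed publications are OR'ed, since a row published by any one publication must be copied during initial synchronization.

```python
def build_copy_query(nspname, relname, columns, quals):
    """Sketch of the COPY command copy_table() sends to the publisher.

    columns: attribute names for the COPY column list.
    quals:   deparsed row-filter expressions from all subscribed
             publications, combined with OR.
    """
    # Simplified quoting; the server code uses quote_identifier().
    table = f'"{nspname}"."{relname}"'
    if not quals:
        # No row filter anywhere: plain COPY of the whole table.
        return f"COPY {table} TO STDOUT"
    cols = ", ".join(f'"{c}"' for c in columns)
    where = " OR ".join(quals)
    # ONLY matches the patch: child partitions are not scanned here.
    return f"COPY (SELECT {cols} FROM ONLY {table} WHERE {where}) TO STDOUT"
```

For example, two publications filtering the same table produce a single `COPY (SELECT ... WHERE q1 OR q2) TO STDOUT` command.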
Attachment: 0008-Debug-for-row-filtering.patch (application/octet-stream)
From fb30e58b819e6c65736c35192c114b5239ccd01c Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Wed, 14 Mar 2018 00:53:17 +0000
Subject: [PATCH 8/8] Debug for row filtering
---
src/backend/commands/publicationcmds.c | 11 +++++
src/backend/replication/logical/tablesync.c | 1 +
src/backend/replication/pgoutput/pgoutput.c | 66 +++++++++++++++++++++++++++++
3 files changed, 78 insertions(+)
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index f6a91a33f6..17167a23ab 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -337,6 +337,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *rels = NIL;
Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
Oid pubid = pubform->oid;
+ ListCell *lc;
/* Check that user is allowed to manipulate the publication tables. */
if (pubform->puballtables)
@@ -348,6 +349,16 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ foreach(lc, stmt->tables)
+ {
+ PublicationTable *t = lfirst(lc);
+
+ if (t->whereClause == NULL)
+ elog(DEBUG3, "publication \"%s\" has no WHERE clause", NameStr(pubform->pubname));
+ else
+ elog(DEBUG3, "publication \"%s\" has WHERE clause", NameStr(pubform->pubname));
+ }
+
/*
* ALTER PUBLICATION ... DROP TABLE cannot contain a WHERE clause. Use
* publication_table_list node (that accepts a WHERE clause) but forbid
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 5d58d1cd66..3af9b2015c 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -866,6 +866,7 @@ copy_table(Relation rel)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
}
+ elog(DEBUG2, "COPY for initial synchronization: %s", cmd.data);
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 9500c5e52a..a57557d209 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -30,6 +30,7 @@
#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
+#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -319,6 +320,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
MemoryContext old;
RelationSyncEntry *relentry;
+ Form_pg_class class_form;
+ char *schemaname;
+ char *tablename;
+
if (!is_publishable_relation(relation))
return;
@@ -343,6 +348,17 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Assert(false);
}
+ class_form = RelationGetForm(relation);
+ schemaname = get_namespace_name(class_form->relnamespace);
+ tablename = NameStr(class_form->relname);
+
+ if (change->action == REORDER_BUFFER_CHANGE_INSERT)
+ elog(DEBUG1, "INSERT \"%s\".\"%s\" txid: %u", schemaname, tablename, txn->xid);
+ else if (change->action == REORDER_BUFFER_CHANGE_UPDATE)
+ elog(DEBUG1, "UPDATE \"%s\".\"%s\" txid: %u", schemaname, tablename, txn->xid);
+ else if (change->action == REORDER_BUFFER_CHANGE_DELETE)
+ elog(DEBUG1, "DELETE \"%s\".\"%s\" txid: %u", schemaname, tablename, txn->xid);
+
/* ... then check row filter */
if (list_length(relentry->qual) > 0)
{
@@ -360,6 +376,42 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
tupdesc = RelationGetDescr(relation);
estate = create_estate_for_relation(relation);
+#ifdef _NOT_USED
+ if (old_tuple)
+ {
+ int i;
+
+ for (i = 0; i < tupdesc->natts; i++)
+ {
+ Form_pg_attribute attr;
+ HeapTuple type_tuple;
+ Oid typoutput;
+ bool typisvarlena;
+ bool isnull;
+ Datum val;
+ char *outputstr = NULL;
+
+ attr = TupleDescAttr(tupdesc, i);
+
+ /* Figure out type name */
+ type_tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(attr->atttypid));
+ if (HeapTupleIsValid(type_tuple))
+ {
+ /* Get information needed for printing values of a type */
+ getTypeOutputInfo(attr->atttypid, &typoutput, &typisvarlena);
+
+ val = heap_getattr(old_tuple, i + 1, tupdesc, &isnull);
+ if (!isnull)
+ {
+ outputstr = OidOutputFunctionCall(typoutput, val);
+ elog(DEBUG2, "row filter: REPLICA IDENTITY %s: %s", NameStr(attr->attname), outputstr);
+ pfree(outputstr);
+ }
+ }
+ }
+ }
+#endif
+
/* prepare context per tuple */
ecxt = GetPerTupleExprContext(estate);
oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
@@ -375,6 +427,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Oid expr_type;
Datum res;
bool isnull;
+ char *s = NULL;
qual = (Node *) lfirst(lc);
@@ -385,12 +438,22 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
expr_state = ExecInitExpr(expr, NULL);
res = ExecEvalExpr(expr_state, ecxt, &isnull);
+ elog(DEBUG3, "row filter: result: %s ; isnull: %s", (DatumGetBool(res)) ? "true" : "false", (isnull) ? "true" : "false");
+
/* if tuple does not match row filter, bail out */
if (!DatumGetBool(res) || isnull)
{
+ s = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, CStringGetTextDatum(nodeToString(qual)), ObjectIdGetDatum(relentry->relid)));
+ elog(DEBUG2, "row filter \"%s\" was not matched", s);
+ pfree(s);
+
matched = false;
break;
}
+
+ s = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, CStringGetTextDatum(nodeToString(qual)), ObjectIdGetDatum(relentry->relid)));
+ elog(DEBUG2, "row filter \"%s\" was matched", s);
+ pfree(s);
}
MemoryContextSwitchTo(oldcxt);
@@ -664,10 +727,13 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
MemoryContext oldctx = MemoryContextSwitchTo(CacheMemoryContext);
char *s = TextDatumGetCString(rf_datum);
+ char *t = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, rf_datum, ObjectIdGetDatum(entry->relid)));
Node *rf_node = stringToNode(s);
entry->qual = lappend(entry->qual, rf_node);
MemoryContextSwitchTo(oldctx);
+
+ elog(DEBUG2, "row filter \"%s\" found for publication \"%s\" and relation \"%s\"", t, pub->name, get_rel_name(relid));
}
ReleaseSysCache(rf_tuple);
--
2.11.0
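The loop this debug patch instruments (in `pgoutput_change()`) applies the cached quals to the decoded tuple. Its matching semantics can be sketched as follows; this is my reading of the patch, with quals modeled as plain Python predicates rather than planned expression trees, so treat it as an assumption-laden sketch. Note the asymmetry with initial sync: here every qual must pass, and a NULL result counts as non-matching.

```python
def row_matches(row, quals):
    """Sketch of the row-filter check in pgoutput_change().

    quals are predicates returning True, False, or None (SQL NULL).
    A change is published only if every qual evaluates to true;
    NULL behaves like false, as in a WHERE clause.
    """
    for qual in quals:
        result = qual(row)
        if result is None or not result:
            # One failing (or NULL) filter suppresses the change.
            return False
    # No quals at all means the table is unfiltered.
    return True
```

With the example from the test file, a row `(1980, 'not filtered')` passes `a > 1000 AND b <> 'filtered'` while `(1500, 'filtered')` does not.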
Attachment: 0006-Print-publication-WHERE-condition-in-psql.patch (application/octet-stream)
From 3d72495c8d5089835eebef0f91911768dba0dcff Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Thu, 17 May 2018 20:52:28 +0000
Subject: [PATCH 6/8] Print publication WHERE condition in psql
---
src/bin/psql/describe.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index b3b9313b36..28a959ce3b 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5873,7 +5873,8 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
+ "SELECT n.nspname, c.relname,\n"
+ " pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
@@ -5903,6 +5904,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE %s",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
--
2.11.0
Attachment: 0007-Publication-where-condition-support-for-pg_dump.patch (application/octet-stream)
From edf803dcbff0f3f2429380d795f34a87ffc708b6 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler@timbira.com.br>
Date: Sat, 15 Sep 2018 02:52:00 +0000
Subject: [PATCH 7/8] Publication where condition support for pg_dump
---
src/bin/pg_dump/pg_dump.c | 15 +++++++++++++--
src/bin/pg_dump/pg_dump.h | 1 +
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index bf69adc2f4..0eddf204f6 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3956,6 +3956,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_tableoid;
int i_oid;
int i_pubname;
+ int i_pubrelqual;
int i,
j,
ntups;
@@ -3988,7 +3989,8 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
/* Get the publication membership for the table. */
appendPQExpBuffer(query,
- "SELECT pr.tableoid, pr.oid, p.pubname "
+ "SELECT pr.tableoid, pr.oid, p.pubname, "
+ "pg_catalog.pg_get_expr(pr.prqual, pr.prrelid) AS pubrelqual "
"FROM pg_publication_rel pr, pg_publication p "
"WHERE pr.prrelid = '%u'"
" AND p.oid = pr.prpubid",
@@ -4009,6 +4011,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_pubname = PQfnumber(res, "pubname");
+ i_pubrelqual = PQfnumber(res, "pubrelqual");
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4024,6 +4027,11 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].pubname = pg_strdup(PQgetvalue(res, j, i_pubname));
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, j, i_pubrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, j, i_pubrelqual));
+
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
}
@@ -4052,8 +4060,11 @@ dumpPublicationTable(Archive *fout, PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubrinfo->pubname));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE %s", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 7b2c1524a5..e0ba005d75 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -611,6 +611,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
TableInfo *pubtable;
char *pubname;
+ char *pubrelqual;
} PublicationRelInfo;
/*
--
2.11.0
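The statement assembly changed by this pg_dump patch is small but easy to misread in diff form: the semicolon moves so the optional WHERE clause can be appended first. A Python sketch (function name mine, not the C `dumpPublicationTable()`):

```python
def dump_publication_table(pubname, qualified_table, qual=None):
    """Sketch of the ALTER PUBLICATION command pg_dump emits per table.

    qual is the deparsed prqual (pg_get_expr output), or None when the
    catalog column is NULL; only then is WHERE omitted.
    """
    stmt = f"ALTER PUBLICATION {pubname} ADD TABLE ONLY {qualified_table}"
    if qual is not None:
        stmt += f" WHERE {qual}"
    # Terminating semicolon is appended last, after the optional WHERE.
    return stmt + ";\n"
```

So a filtered table dumps as `ALTER PUBLICATION pub ADD TABLE ONLY public.t WHERE (a > 1);` and restoring it re-creates the row filter.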
On Mon, Nov 25, 2019 at 11:48:29AM +0900, Amit Langote wrote:
On Mon, Nov 25, 2019 at 11:38 AM Amit Langote <amitlangote09@gmail.com> wrote:
Needed to be rebased, which I did, to be able to test them; patches attached.
Oops, really attached this time.
Euler, this thread is waiting for input from you regarding the latest
comments from Amit.
--
Michael
On Thu, Nov 28, 2019 at 11:32:01AM +0900, Michael Paquier wrote:
On Mon, Nov 25, 2019 at 11:48:29AM +0900, Amit Langote wrote:
On Mon, Nov 25, 2019 at 11:38 AM Amit Langote <amitlangote09@gmail.com> wrote:
Needed to be rebased, which I did, to be able to test them; patches attached.
Oops, really attached this time.
Euler, this thread is waiting for input from you regarding the latest
comments from Amit.
Euler, this patch is still in "waiting on author" since 11/25. Do you
plan to review changes made by Amit in the patches he submitted, or what
are your plans with this patch?
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Jan 16, 2020 at 18:57, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Euler, this patch is still in "waiting on author" since 11/25. Do you
plan to review changes made by Amit in the patches he submitted, or what
are your plans with this patch?
Yes, I'm working on Amit's suggestions. I'll post a new patch as soon as possible.
--
Euler Taveira Timbira -
http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
On Fri, 17 Jan 2020 at 07:58, Euler Taveira <euler@timbira.com.br> wrote:
On Thu, Jan 16, 2020 at 18:57, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Euler, this patch is still in "waiting on author" since 11/25. Do you
plan to review changes made by Amit in the patches he submitted, or what
are your plans with this patch?
Yes, I'm working on Amit's suggestions. I'll post a new patch as soon as possible.
Great. I think this'd be nice to see.
Were you able to fully address the following points that came up in
the discussion?
* Make sure row filters cannot access non-catalog, non-user-catalog
relations i.e. can only use RelationIsAccessibleInLogicalDecoding rels
* Prevent filters from attempting to access attributes that may not be
WAL-logged in a given change record, or give them a way to test for
this. Unchanged TOASTed atts are not logged. There's also REPLICA
IDENTITY FULL to consider if exposing access to the old tuple in the
filter.
Also, while I'm not sure if it was raised earlier, experience with row
filtering in pglogical has shown that error handling is challenging.
Because row filters are read from a historic snapshot of the catalogs
you cannot change them or any SQL or plpgsql functions they use if a
problem causes an ERROR when executing the filter expression. You can
fix the current snapshot's definition but the decoding session won't
see it and will continue to ERROR. We don't really have a good answer
for that yet in pglogical; right now you have to either intervene with
low level tools or drop the subscription and re-create it. Neither of
which is ideal.
You can't just read the row filter from the current snapshot as the
relation definition (atts etc) may not match. Plus that creates a
variety of issues with which txns get which version of a row filter
applied during decoding, consistency between multiple subscribers,
etc.
One option I've thought about was a GUC that allows users to specify
what should be done for errors in row filter expressions: drop the row
as if the filter rejected it; pass the row as if the filter matched;
propagate the ERROR and end the decoding session (default).
I'd welcome ideas about this one. I don't think it's a showstopper for
accepting the feature either, we just have to document that great care
is required with any operator or function that could raise an error in
a row filter. But there are just so many often non-obvious ways you
can land up with an ERROR being thrown that I think it's a bit of a
user foot-gun.
--
Craig Ringer http://www.2ndQuadrant.com/
2ndQuadrant - PostgreSQL Solutions for the Enterprise
Hi Euler,
On 1/21/20 2:32 AM, Craig Ringer wrote:
On Fri, 17 Jan 2020 at 07:58, Euler Taveira <euler@timbira.com.br> wrote:
On Thu, Jan 16, 2020 at 18:57, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Euler, this patch is still in "waiting on author" since 11/25. Do you
plan to review changes made by Amit in the patches he submitted, or what
are your plans with this patch?
Yes, I'm working on Amit's suggestions. I'll post a new patch as soon as possible.
Great. I think this'd be nice to see.
The last CF for PG13 has started. Do you have a new patch ready?
Regards,
--
-David
david@pgmasters.net
On 3/3/20 12:39 PM, David Steele wrote:
Hi Euler,
On 1/21/20 2:32 AM, Craig Ringer wrote:
On Fri, 17 Jan 2020 at 07:58, Euler Taveira <euler@timbira.com.br> wrote:
On Thu, Jan 16, 2020 at 18:57, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Euler, this patch is still in "waiting on author" since 11/25. Do you
plan to review changes made by Amit in the patches he submitted, or what
are your plans with this patch?
Yes, I'm working on Amit's suggestions. I'll post a new patch as soon as possible.
Great. I think this'd be nice to see.
The last CF for PG13 has started. Do you have a new patch ready?
I have marked this patch Returned with Feedback since no new patch has
been posted.
Please submit to a future CF when a new patch is available.
Regards,
--
-David
david@pgmasters.net
Hi all,
I'm also interested in this patch. I rebased the changes onto the current
master branch; the result is attached. The rebase had two issues. First,
patch 8 conflicted, and it seems useful only for debugging during
development, so I dropped it for simplicity. Second, the changes conflict
with the `publish_via_partition_root` changes. I tried to fix that, but
ended up with a limitation for now: a publication with a WHERE clause
cannot be created on a partitioned table unless
publish_via_partition_root is set to true. This restriction can be
lifted, though I left it out to focus on some issues that I observed in
this patch.
Please see my review:
+ if (list_length(relentry->qual) > 0)
+ {
+ HeapTuple old_tuple;
+ HeapTuple new_tuple;
+ TupleDesc tupdesc;
+ EState *estate;
+ ExprContext *ecxt;
+ MemoryContext oldcxt;
+ ListCell *lc;
+ bool matched = true;
+
+ old_tuple = change->data.tp.oldtuple ? &change->data.tp.oldtuple->tuple : NULL;
+ new_tuple = change->data.tp.newtuple ? &change->data.tp.newtuple->tuple : NULL;
+ tupdesc = RelationGetDescr(relation);
+ estate = create_estate_for_relation(relation);
+
+ /* prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
+
+ ExecStoreHeapTuple(new_tuple ? new_tuple : old_tuple, ecxt->ecxt_scantuple, false);
+
+ foreach(lc, relentry->qual)
+ {
+ Node *qual;
+ ExprState *expr_state;
+ Expr *expr;
+ Oid expr_type;
+ Datum res;
+ bool isnull;
+
+ qual = (Node *) lfirst(lc);
+
+ /* evaluates row filter */
+ expr_type = exprType(qual);
+ expr = (Expr *) coerce_to_target_type(NULL, qual, expr_type, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+ expr = expression_planner(expr);
+ expr_state = ExecInitExpr(expr, NULL);
+ res = ExecEvalExpr(expr_state, ecxt, &isnull);
+
+ /* if tuple does not match row filter, bail out */
+ if (!DatumGetBool(res) || isnull)
+ {
+ matched = false;
+ break;
+ }
+ }
+
+ MemoryContextSwitchTo(oldcxt);
+
The above part can be considered the core of the logic, executed per tuple.
As far as I can see, it has two downsides.
First, calling `expression_planner()` for every tuple can be quite
expensive. I created a sample table, loaded data and ran a quick benchmark
to see its effect. I attached the very simple script that I used to
reproduce the issue on my laptop. I'm pretty sure you can find nicer ways
of doing similar perf tests, just sharing as a reference.
The idea of the test is to add a WHERE clause to a table, but none of the
tuples are filtered out. They just go through this code-path and send it to
the remote node.
#rows    Patched            Master
1M       00:00:25.067536    00:00:16.633988
10M      00:04:50.770791    00:02:40.945358
So, this seems like a significant overhead to me. What do you think?
Secondly, and probably more importantly, allowing any operator is as dangerous
as allowing any function, since users can create/overload operators. For
example, assume that a user creates an operator whose implementation modifies
the table that is being filtered:
```
CREATE OR REPLACE FUNCTION function_that_modifies_table(left_arg INTEGER,
right_arg INTEGER)
RETURNS BOOL AS
$$
BEGIN
INSERT INTO test SELECT * FROM test;
RETURN left_arg > right_arg;
END;
$$ LANGUAGE PLPGSQL VOLATILE;
CREATE OPERATOR >>= (
PROCEDURE = function_that_modifies_table,
LEFTARG = INTEGER,
RIGHTARG = INTEGER
);
CREATE PUBLICATION pub FOR TABLE test WHERE (key >>= 0);
```
With the above, we seem to be in trouble. Although this is an extreme
example, it seems useful for showing the extent of the problem. We probably
cannot allow arbitrary free-form SQL in the filters.
To overcome these issues, one approach could be to rely on known safe
operators and functions. I believe the btree and hash operators should
provide a pretty strong coverage across many use cases. As far as I can
see, the procs that the following query returns can be our baseline:
```
select DISTINCT amproc.amproc::regproc AS opfamily_procedure
from pg_am am,
pg_opfamily opf,
pg_amproc amproc
where opf.opfmethod = am.oid
and amproc.amprocfamily = opf.oid
order by
opfamily_procedure;
```
With that, we aim to prevent users from easily shooting themselves in the foot.
The other problematic area is performance, as calling
`expression_planner()` for every tuple can be very expensive. To avoid
that, we might consider asking users to provide a function instead of a
free-form WHERE clause, such that the tuple is sent if the function returns
true. The allowed functions would need to be immutable SQL functions with a
bool return type. As we can parse SQL functions, we should be able to allow
only functions that rely on the above-mentioned procs. We can apply as many
restrictions (such as no modifying queries) as possible. For example, see
below:
```
CREATE OR REPLACE function filter_tuples_for_test(int) returns bool as
$body$
select $1 > 100;
$body$
language sql immutable;
CREATE PUBLICATION pub FOR TABLE test FILTER = filter_tuples_for_test(key);
```
In terms of performance, calling the function should avoid the per-tuple
`expression_planner()` call and yield better performance, though this needs
to be verified.
If such an approach makes sense, I'd be happy to work on the patch. Please
provide me feedback.
Thanks,
Onder KALACI
Software Engineer at Microsoft &
Developing the Citus database extension for PostgreSQL
On Wed, Dec 16, 2020 at 21:43, David Steele <david@pgmasters.net> wrote:
On 3/3/20 12:39 PM, David Steele wrote:
Hi Euler,
On 1/21/20 2:32 AM, Craig Ringer wrote:
On Fri, 17 Jan 2020 at 07:58, Euler Taveira <euler@timbira.com.br>
wrote:
On Thu, Jan 16, 2020 at 18:57, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Euler, this patch is still in "waiting on author" since 11/25. Do you
plan to review changes made by Amit in the patches he submitted, or what
are your plans with this patch?
Yes, I'm working on Amit's suggestions. I'll post a new patch as soon as possible.
Great. I think this'd be nice to see.
The last CF for PG13 has started. Do you have a new patch ready?
I have marked this patch Returned with Feedback since no new patch has
been posted. Please submit to a future CF when a new patch is available.
Regards,
--
-David
david@pgmasters.net
Attachments:
0001-Subject-PATCH-1-8-Remove-unused-atttypmod-column-fro.patchapplication/octet-stream; name=0001-Subject-PATCH-1-8-Remove-unused-atttypmod-column-fro.patchDownload
From a8eebea2a2ed9f019657fb09e0e7436c78c46003 Mon Sep 17 00:00:00 2001
From: Onder Kalaci <onderkalaci@gmail.com>
Date: Tue, 8 Dec 2020 16:42:37 +0300
Subject: [PATCH 1/7] Subject: [PATCH 1/8] Remove unused atttypmod column from
initial table synchronization
Since commit 7c4f52409a8c7d85ed169bbbc1f6092274d03920, atttypmod was
added but not used. The removal is safe because COPY from publisher
does not need such information.
---
src/backend/replication/logical/tablesync.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 1904f3471c..7357458db9 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -641,7 +641,7 @@ fetch_remote_table_info(char *nspname, char *relname,
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
- Oid attrRow[] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
+ Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
bool isnull;
int natt;
@@ -686,7 +686,6 @@ fetch_remote_table_info(char *nspname, char *relname,
appendStringInfo(&cmd,
"SELECT a.attname,"
" a.atttypid,"
- " a.atttypmod,"
" a.attnum = ANY(i.indkey)"
" FROM pg_catalog.pg_attribute a"
" LEFT JOIN pg_catalog.pg_index i"
@@ -719,7 +718,7 @@ fetch_remote_table_info(char *nspname, char *relname,
Assert(!isnull);
lrel->atttyps[natt] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
Assert(!isnull);
- if (DatumGetBool(slot_getattr(slot, 4, &isnull)))
+ if (DatumGetBool(slot_getattr(slot, 3, &isnull)))
lrel->attkeys = bms_add_member(lrel->attkeys, natt);
/* Should never happen. */
--
2.19.0
0003-Subject-PATCH-3-8-Refactor-function-create_estate_fo.patchapplication/octet-stream; name=0003-Subject-PATCH-3-8-Refactor-function-create_estate_fo.patchDownload
From f93e0702377b2cd74201de25ea7516f335c10200 Mon Sep 17 00:00:00 2001
From: Onder Kalaci <onderkalaci@gmail.com>
Date: Tue, 8 Dec 2020 16:43:51 +0300
Subject: [PATCH 3/7] Subject: [PATCH 3/8] Refactor function
create_estate_for_relation
Relation localrel is the only LogicalRepRelMapEntry structure member
that is useful for create_estate_for_relation.
---
src/backend/replication/logical/worker.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 8c7fad8f74..e742eceb71 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -341,7 +341,7 @@ handle_streamed_transaction(LogicalRepMsgType action, StringInfo s)
* This is based on similar code in copy.c
*/
static EState *
-create_estate_for_relation(LogicalRepRelMapEntry *rel)
+create_estate_for_relation(Relation rel)
{
EState *estate;
RangeTblEntry *rte;
@@ -1176,7 +1176,7 @@ apply_handle_insert(StringInfo s)
}
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -1301,7 +1301,7 @@ apply_handle_update(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -1457,7 +1457,7 @@ apply_handle_delete(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
--
2.19.0
0004-Subject-PATCH-4-8-Rename-a-WHERE-node.patchapplication/octet-stream; name=0004-Subject-PATCH-4-8-Rename-a-WHERE-node.patchDownload
From e152642558d9b423f112d48870f4fcd6e3ec51c6 Mon Sep 17 00:00:00 2001
From: Onder Kalaci <onderkalaci@gmail.com>
Date: Tue, 8 Dec 2020 16:44:15 +0300
Subject: [PATCH 4/7] Subject: [PATCH 4/8] Rename a WHERE node
A WHERE clause will be used for row filtering in logical replication. We
already have a similar node: 'WHERE (condition here)'. Let's rename the
node to a generic name and use it for row filtering too.
---
src/backend/parser/gram.y | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index ecff4cd2ac..c0bb44a85c 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -484,7 +484,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <node> def_arg columnElem where_clause where_or_current_clause
a_expr b_expr c_expr AexprConst indirection_el opt_slice_bound
columnref in_expr having_clause func_table xmltable array_expr
- ExclusionWhereClause operator_def_arg
+ OptWhereClause operator_def_arg
%type <list> rowsfrom_item rowsfrom_list opt_col_def_list
%type <boolean> opt_ordinality
%type <list> ExclusionConstraintList ExclusionConstraintElem
@@ -3756,7 +3756,7 @@ ConstraintElem:
$$ = (Node *)n;
}
| EXCLUDE access_method_clause '(' ExclusionConstraintList ')'
- opt_c_include opt_definition OptConsTableSpace ExclusionWhereClause
+ opt_c_include opt_definition OptConsTableSpace OptWhereClause
ConstraintAttributeSpec
{
Constraint *n = makeNode(Constraint);
@@ -3858,7 +3858,7 @@ ExclusionConstraintElem: index_elem WITH any_operator
}
;
-ExclusionWhereClause:
+OptWhereClause:
WHERE '(' a_expr ')' { $$ = $3; }
| /*EMPTY*/ { $$ = NULL; }
;
--
2.19.0
0002-Subject-PATCH-2-8-Store-number-of-tuples-in-WalRcvEx.patchapplication/octet-stream; name=0002-Subject-PATCH-2-8-Store-number-of-tuples-in-WalRcvEx.patchDownload
From 31bcef5516beee095d5afe0fa0090183dbfdff9d Mon Sep 17 00:00:00 2001
From: Onder Kalaci <onderkalaci@gmail.com>
Date: Tue, 8 Dec 2020 16:43:23 +0300
Subject: [PATCH 2/7] Subject: [PATCH 2/8] Store number of tuples in
WalRcvExecResult
This is useful information when allocating memory for queries that
return more than one row. It reduces memory allocation for initial
table synchronization.
---
src/backend/replication/libpqwalreceiver/libpqwalreceiver.c | 5 +++--
src/backend/replication/logical/tablesync.c | 5 ++---
src/include/replication/walreceiver.h | 1 +
3 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 24f8b3e42e..15a781fcc3 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -920,6 +920,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
errdetail("Expected %d fields, got %d fields.",
nRetTypes, nfields)));
+ walres->ntuples = PQntuples(pgres);
walres->tuplestore = tuplestore_begin_heap(true, false, work_mem);
/* Create tuple descriptor corresponding to expected result. */
@@ -930,7 +931,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
/* No point in doing more here if there were no tuples returned. */
- if (PQntuples(pgres) == 0)
+ if (walres->ntuples == 0)
return;
/* Create temporary context for local allocations. */
@@ -939,7 +940,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
ALLOCSET_DEFAULT_SIZES);
/* Process returned rows. */
- for (tupn = 0; tupn < PQntuples(pgres); tupn++)
+ for (tupn = 0; tupn < walres->ntuples; tupn++)
{
char *cstrs[MaxTupleAttributeNumber];
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 7357458db9..8b0d2b13ac 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -704,9 +704,8 @@ fetch_remote_table_info(char *nspname, char *relname,
(errmsg("could not fetch table info for table \"%s.%s\": %s",
nspname, relname, res->err)));
- /* We don't know the number of rows coming, so allocate enough space. */
- lrel->attnames = palloc0(MaxTupleAttributeNumber * sizeof(char *));
- lrel->atttyps = palloc0(MaxTupleAttributeNumber * sizeof(Oid));
+ lrel->attnames = palloc0(res->ntuples * sizeof(char *));
+ lrel->atttyps = palloc0(res->ntuples * sizeof(Oid));
lrel->attkeys = NULL;
natt = 0;
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index 1b05b39df4..ac0d7bf730 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -213,6 +213,7 @@ typedef struct WalRcvExecResult
char *err;
Tuplestorestate *tuplestore;
TupleDesc tupledesc;
+ int ntuples;
} WalRcvExecResult;
/* WAL receiver - libpqwalreceiver hooks */
--
2.19.0
0005-Subject-PATCH-5-8-Row-filtering-for-logical-replicat.patchapplication/octet-stream; name=0005-Subject-PATCH-5-8-Row-filtering-for-logical-replicat.patchDownload
From 358a6a0067550f0ded23c7676557b09ebdbae98f Mon Sep 17 00:00:00 2001
From: Onder Kalaci <onderkalaci@gmail.com>
Date: Tue, 8 Dec 2020 16:44:50 +0300
Subject: [PATCH 5/7] Subject: [PATCH 5/8] Row filtering for logical
replication
When you define or modify a publication you optionally can filter rows
to be published using a WHERE condition. This condition is any
expression that evaluates to boolean. Only those rows that
satisfy the WHERE condition will be sent to subscribers.
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 21 +-
src/backend/catalog/pg_publication.c | 207 +++++++++++++++++---
src/backend/commands/publicationcmds.c | 95 ++++++---
src/backend/parser/gram.y | 25 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 13 ++
src/backend/parser/parse_func.c | 3 +
src/backend/replication/logical/tablesync.c | 122 ++++++++++--
src/backend/replication/logical/worker.c | 6 +-
src/backend/replication/pgoutput/pgoutput.c | 110 ++++++++++-
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 5 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/include/replication/logicalrelation.h | 2 +
src/test/regress/expected/publication.out | 29 +++
src/test/regress/sql/publication.sql | 21 ++
20 files changed, 626 insertions(+), 85 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 79069ddfab..b48f97d82e 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -5609,6 +5609,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
The expression tree to be added to the WITH CHECK qualifications for queries that attempt to add rows to the table
</para></entry>
</row>
+
+ <row>
+ <entry><structfield>prqual</structfield></entry>
+ <entry><type>pg_node_tree</type></entry>
+ <entry></entry>
+ <entry>Expression tree (in the form of a
+ <function>nodeToString()</function> representation) for the relation's
+ qualifying condition</entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index c2946dfe0f..ae4da00711 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -91,7 +91,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is executed with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbca55..4b015b37f3 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -182,6 +182,13 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
disallowed on those tables.
</para>
+ <para>
+ Columns used in the <literal>WHERE</literal> clause must be part of the
+ primary key or be covered by <literal>REPLICA IDENTITY</literal>; otherwise
+ <command>UPDATE</command> and <command>DELETE</command> operations will not
+ be replicated.
+ </para>
+
<para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
@@ -197,6 +204,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -209,6 +221,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</programlisting>
</para>
+ <para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
<para>
Create a publication that publishes all changes in all tables:
<programlisting>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 09946be788..3ef427e16c 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,11 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
+
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -42,6 +47,9 @@
#include "utils/rel.h"
#include "utils/syscache.h"
+static List * PublicationPartitionedRelationGetRelations(Oid relationId,
+ PublicationPartOpt pub_partopt);
+
/*
* Check if relation can be in given publication and throws appropriate
* error if not.
@@ -141,18 +149,22 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
+ Oid relid = RelationGetRelid(targetrel->relation);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ RangeTblEntry *rte;
+ Node *whereclause;
+ ParseNamespaceItem *pitem;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -172,10 +184,41 @@ publication_add_relation(Oid pubid, Relation targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel), pub->name)));
+ RelationGetRelationName(targetrel->relation), pub->name)));
+ }
+
+ check_publication_add_relation(targetrel->relation);
+
+ if (get_rel_relkind(relid) == RELKIND_PARTITIONED_TABLE && !pub->pubviaroot &&
+ targetrel->whereClause)
+ {
+ table_close(rel, RowExclusiveLock);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("cannot create publication \"%s\" with WHERE clause on partitioned table "
+ "\"%s\" unless publish_via_partition_root is true", pub->name,
+ RelationGetRelationName(targetrel->relation))));
}
- check_publication_add_relation(targetrel);
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(targetrel->whereClause);
+
+ pitem = addRangeTableEntryForRelation(pstate, targetrel->relation,
+ AccessShareLock,
+ NULL, false, false);
+ rte = pitem->p_rte;
+
+ addNSItemToQuery(pstate, pitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(targetrel->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -189,6 +232,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -205,11 +254,17 @@ publication_add_relation(Oid pubid, Relation targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel);
+ CacheInvalidateRelcache(targetrel->relation);
return myself;
}
@@ -271,31 +326,136 @@ GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt)
pubrel = (Form_pg_publication_rel) GETSTRUCT(tup);
- if (get_rel_relkind(pubrel->prrelid) == RELKIND_PARTITIONED_TABLE &&
- pub_partopt != PUBLICATION_PART_ROOT)
+ if (get_rel_relkind(pubrel->prrelid) != RELKIND_PARTITIONED_TABLE)
+ result = lappend_oid(result, pubrel->prrelid);
+ else
{
- List *all_parts = find_all_inheritors(pubrel->prrelid, NoLock,
- NULL);
+ List *all_parts = PublicationPartitionedRelationGetRelations(pubrel->prrelid, pub_partopt);
- if (pub_partopt == PUBLICATION_PART_ALL)
- result = list_concat(result, all_parts);
- else if (pub_partopt == PUBLICATION_PART_LEAF)
- {
- ListCell *lc;
+ result = list_concat(result, all_parts);
+ }
+ }
- foreach(lc, all_parts)
- {
- Oid partOid = lfirst_oid(lc);
+ systable_endscan(scan);
+ table_close(pubrelsrel, AccessShareLock);
+
+ return result;
+}
+
+
+/*
+ * For the input partitionedRelationId and pub_partopt, return list of relations
+ * that should be used for the publication.
+ *
+ */
+static List *
+PublicationPartitionedRelationGetRelations(Oid partitionedRelationId,
+ PublicationPartOpt pub_partopt)
+{
+ AssertArg(get_rel_relkind(partitionedRelationId) == RELKIND_PARTITIONED_TABLE);
+
+ List *result = NIL;
+ List *all_parts = NIL;
+ if (pub_partopt == PUBLICATION_PART_ROOT)
+ return list_make1_oid(partitionedRelationId);
- if (get_rel_relkind(partOid) != RELKIND_PARTITIONED_TABLE)
- result = lappend_oid(result, partOid);
- }
+ all_parts = find_all_inheritors(partitionedRelationId, NoLock, NULL);
+ if (pub_partopt == PUBLICATION_PART_ALL)
+ result = list_concat(result, all_parts);
+ else if (pub_partopt == PUBLICATION_PART_LEAF)
+ {
+ ListCell *lc;
+
+ foreach(lc, all_parts)
+ {
+ Oid partOid = lfirst_oid(lc);
+
+ if (get_rel_relkind(partOid) != RELKIND_PARTITIONED_TABLE)
+ {
+ result = lappend_oid(result, partOid);
}
- else
- Assert(false);
+ }
+ }
+
+ return result;
+}
+
+
+/*
+ * Gets list of PublicationRelationQuals for a publication.
+ *
+ * This should only be used for normal publications; in FOR ALL TABLES
+ * publications the WHERE clause cannot be used, hence this function
+ * should not be called for them.
+ */
+List *
+GetPublicationRelationQuals(Oid pubid, PublicationPartOpt pub_partopt)
+{
+ List *result;
+ Relation pubrelsrel;
+ ScanKeyData scankey;
+ SysScanDesc scan;
+ HeapTuple tup;
+
+ /* Find all relations associated with the publication. */
+ pubrelsrel = table_open(PublicationRelRelationId, AccessShareLock);
+
+ ScanKeyInit(&scankey,
+ Anum_pg_publication_rel_prpubid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(pubid));
+
+ scan = systable_beginscan(pubrelsrel, PublicationRelPrrelidPrpubidIndexId,
+ true, NULL, 1, &scankey);
+
+ result = NIL;
+ while (HeapTupleIsValid(tup = systable_getnext(scan)))
+ {
+ Form_pg_publication_rel pubrel;
+ Datum value_datum;
+ char *qual_value;
+ Node *qual_expr;
+ bool isnull;
+
+ pubrel = (Form_pg_publication_rel) GETSTRUCT(tup);
+
+ value_datum = heap_getattr(tup, Anum_pg_publication_rel_prqual, RelationGetDescr(pubrelsrel), &isnull);
+ if (!isnull)
+ {
+ qual_value = TextDatumGetCString(value_datum);
+ qual_expr = (Node *) stringToNode(qual_value);
}
else
- result = lappend_oid(result, pubrel->prrelid);
+ qual_expr = NULL;
+
+ if (get_rel_relkind(pubrel->prrelid) != RELKIND_PARTITIONED_TABLE)
+ {
+ PublicationRelationQual *relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = table_open(pubrel->prrelid, ShareUpdateExclusiveLock);
+ relqual->whereClause = copyObject(qual_expr);
+
+ result = lappend(result, relqual);
+ }
+ else
+ {
+ List *all_parts =
+ PublicationPartitionedRelationGetRelations(pubrel->prrelid, pub_partopt);
+ ListCell *lc;
+
+ foreach(lc, all_parts)
+ {
+ Oid partOid = lfirst_oid(lc);
+
+ PublicationRelationQual *relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = table_open(partOid, NoLock);
+
+ /* for all partitions, use the same qual */
+ relqual->whereClause = copyObject(qual_expr);
+ result = lappend(result, relqual);
+ }
+ }
}
systable_endscan(scan);
@@ -304,6 +464,7 @@ GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt)
return result;
}
+
/*
* Gets list of publication oids for publications marked as FOR ALL TABLES.
*/
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index eabbc7473b..ffc1d14ec7 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -372,6 +372,28 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ /*
+ * ALTER PUBLICATION ... DROP TABLE cannot contain a WHERE clause. Use
+ * publication_table_list node (that accepts a WHERE clause) but forbid
+ * the WHERE clause in it. Using a relation_expr_list node just for the
+ * DROP TABLE part is not worth the trouble.
+ */
+ if (stmt->tableAction == DEFELEM_DROP)
+ {
+ ListCell *lc;
+
+ foreach(lc, stmt->tables)
+ {
+ PublicationTable *t = lfirst(lc);
+
+ if (t->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot use a WHERE clause for removing table from publication \"%s\"",
+ NameStr(pubform->pubname))));
+ }
+ }
+
rels = OpenTableList(stmt->tables);
if (stmt->tableAction == DEFELEM_ADD)
@@ -380,48 +402,59 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
PublicationDropTables(pubid, rels, false);
else /* DEFELEM_SET */
{
- List *oldrelids = GetPublicationRelations(pubid,
- PUBLICATION_PART_ROOT);
+ List *oldrelquals = GetPublicationRelationQuals(pubid,
+ PUBLICATION_PART_ROOT);
List *delrels = NIL;
- ListCell *oldlc;
+ ListCell *oldrelqualc;
/* Calculate which relations to drop. */
- foreach(oldlc, oldrelids)
+ foreach(oldrelqualc, oldrelquals)
{
- Oid oldrelid = lfirst_oid(oldlc);
+ PublicationRelationQual *oldrelqual = lfirst(oldrelqualc);
+ PublicationRelationQual *newrelqual;
ListCell *newlc;
bool found = false;
foreach(newlc, rels)
{
- Relation newrel = (Relation) lfirst(newlc);
+ newrelqual = (PublicationRelationQual *) lfirst(newlc);
- if (RelationGetRelid(newrel) == oldrelid)
+ if (RelationGetRelid(newrelqual->relation) == RelationGetRelid(oldrelqual->relation))
{
found = true;
break;
}
}
- if (!found)
+
+ /*
+ * Remove the publication / relation mapping if (i) the table is not
+ * found in the new list or (ii) the table is found in the new list
+ * but its qual does not match the old one (in this case, a simple
+ * tuple update is not enough because of the dependencies).
+ */
+ if (!found || !equal(oldrelqual->whereClause, newrelqual->whereClause))
{
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
+ PublicationRelationQual *oldrelqual2 = palloc(sizeof(PublicationRelationQual));
- delrels = lappend(delrels, oldrel);
+ oldrelqual2->relation = table_open(RelationGetRelid(oldrelqual->relation),
+ ShareUpdateExclusiveLock);
+
+ delrels = lappend(delrels, oldrelqual2);
}
}
/* And drop them. */
PublicationDropTables(pubid, delrels, true);
+ CloseTableList(oldrelquals);
+ CloseTableList(delrels);
+
/*
* Don't bother calculating the difference for adding, we'll catch and
* skip existing ones when doing catalog update.
*/
PublicationAddTables(pubid, rels, true, stmt);
-
- CloseTableList(delrels);
}
CloseTableList(rels);
@@ -509,13 +542,15 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationQual *relqual;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
+ PublicationTable *t = lfirst(lc);
+ RangeVar *rv = castNode(RangeVar, t->relation);
bool recurse = rv->inh;
Relation rel;
Oid myrelid;
@@ -538,8 +573,10 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = rel;
+ relqual->whereClause = t->whereClause;
+ rels = lappend(rels, relqual);
relids = lappend_oid(relids, myrelid);
/*
@@ -572,7 +609,11 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = rel;
+ /* child inherits WHERE clause from parent */
+ relqual->whereClause = t->whereClause;
+ rels = lappend(rels, relqual);
relids = lappend_oid(relids, childrelid);
}
}
@@ -593,10 +634,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -612,13 +655,13 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(RelationGetRelid(rel->relation), GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->relation->rd_rel->relkind),
+ RelationGetRelationName(rel->relation));
obj = publication_add_relation(pubid, rel, if_not_exists);
if (stmt)
@@ -644,8 +687,8 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
+ Oid relid = RelationGetRelid(rel->relation);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
ObjectIdGetDatum(relid),
@@ -658,7 +701,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(rel->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c0bb44a85c..af7cec58e7 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -414,13 +414,14 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
relation_expr_list dostmt_opt_list
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
+ publication_table_list
vacuum_relation_list opt_vacuum_relation_list
drop_option_list
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -9446,7 +9447,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9477,7 +9478,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9485,7 +9486,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9493,7 +9494,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE relation_expr_list
+ | ALTER PUBLICATION name DROP TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9503,6 +9504,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
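With these rules, each table element in the list may carry its own optional WHERE clause. For example (table names are illustrative, not from the patch), the new grammar accepts:

```sql
-- the WHERE clause attaches to the table element immediately before it,
-- so here only t2 is filtered; t1 and t3 publish all rows
CREATE PUBLICATION p FOR TABLE t1, t2 WHERE (d < 5), t3;
```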
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 783f3fe8f2..722272f2ba 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -544,6 +544,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -933,6 +940,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_GENERATED_COLUMN:
err = _("window functions are not allowed in column generation expressions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 36002f059d..c5bc464806 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -169,6 +169,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in WHERE"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -567,6 +574,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_CALL_ARGUMENT:
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1889,6 +1897,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3488,6 +3499,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "WHERE";
case EXPR_KIND_GENERATED_COLUMN:
return "GENERATED AS";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 23ac2a2fe6..a793c3bf79 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2527,6 +2527,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_GENERATED_COLUMN:
err = _("set-returning functions are not allowed in column generation expressions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 8b0d2b13ac..a43a7d011f 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -631,19 +631,26 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[1] = {TEXTOID};
bool isnull;
- int natt;
+ int n;
+ ListCell *lc;
+ bool first;
+
+ /* Avoid trashing relation map cache */
+ memset(lrel, 0, sizeof(LogicalRepRelation));
lrel->nspname = nspname;
lrel->relname = relname;
@@ -708,20 +715,20 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->atttyps = palloc0(res->ntuples * sizeof(Oid));
lrel->attkeys = NULL;
- natt = 0;
+ n = 0;
slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
{
- lrel->attnames[natt] =
+ lrel->attnames[n] =
TextDatumGetCString(slot_getattr(slot, 1, &isnull));
Assert(!isnull);
- lrel->atttyps[natt] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
+ lrel->atttyps[n] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
Assert(!isnull);
if (DatumGetBool(slot_getattr(slot, 3, &isnull)))
- lrel->attkeys = bms_add_member(lrel->attkeys, natt);
+ lrel->attkeys = bms_add_member(lrel->attkeys, n);
/* Should never happen. */
- if (++natt >= MaxTupleAttributeNumber)
+ if (++n >= MaxTupleAttributeNumber)
elog(ERROR, "too many columns in remote table \"%s.%s\"",
nspname, relname);
@@ -729,12 +736,85 @@ fetch_remote_table_info(char *nspname, char *relname,
}
ExecDropSingleTupleTableSlot(slot);
- lrel->natts = natt;
+ lrel->natts = n;
+
+ walrcv_clear_result(res);
+
+ /* Get relation qual */
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(wrconn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
walrcv_clear_result(res);
pfree(cmd.data);
}
+static char *
+TableQualToText(List *qual)
+{
+ StringInfoData cmd;
+ ListCell *lc;
+ bool first = true;
+
+ if (qual == NIL)
+ return pstrdup("true");
+
+ initStringInfo(&cmd);
+
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfoString(&cmd, q);
+ }
+
+ return cmd.data;
+}
+
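To sketch what TableQualToText produces: when the subscribed table appears in more than one of the subscription's publications, the per-publication quals are joined with OR, so the initial COPY picks up any row that matches at least one filter. A hypothetical example (publication and table names are illustrative):

```sql
-- On the publisher:
CREATE PUBLICATION pub_a FOR TABLE t WHERE (id > 2000 AND id <= 3000);
CREATE PUBLICATION pub_b FOR TABLE t WHERE (flag = true);

-- A subscription to both publications would then sync t with roughly:
-- COPY (SELECT id, flag FROM public.t
--       WHERE (id > 2000 AND id <= 3000) OR (flag = true)) TO STDOUT
```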
/*
* Copy existing data of a table from publisher.
*
@@ -745,6 +825,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -753,7 +834,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -762,16 +843,20 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* list of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
- * For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * For non-tables or tables with quals, we need to do
+ * COPY (SELECT ...), but we can't just do SELECT * because
+ * we need to not copy generated columns.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -780,9 +865,14 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
- quote_qualified_identifier(lrel.nspname, lrel.relname));
+ appendStringInfo(&cmd, " FROM %s WHERE %s) TO STDOUT",
+ quote_qualified_identifier(lrel.nspname, lrel.relname),
+ TableQualToText(qual));
}
+
+ /* we don't need quals anymore */
+ list_free_deep(qual);
+
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
@@ -797,7 +887,6 @@ copy_table(Relation rel)
(void) addRangeTableEntryForRelation(pstate, rel, AccessShareLock,
NULL, false, false);
- attnamelist = make_copy_attnamelist(relmapentry);
cstate = BeginCopyFrom(pstate, rel, NULL, NULL, false, copy_read_data, attnamelist, NIL);
/* Do the copy */
@@ -806,6 +895,7 @@ copy_table(Relation rel)
logicalrep_rel_close(relmapentry, NoLock);
}
+
/*
* Start syncing the table in the sync worker.
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index e742eceb71..29db29e7ba 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -340,7 +340,7 @@ handle_streamed_transaction(LogicalRepMsgType action, StringInfo s)
*
* This is based on similar code in copy.c
*/
-static EState *
+EState *
create_estate_for_relation(Relation rel)
{
EState *estate;
@@ -350,8 +350,8 @@ create_estate_for_relation(Relation rel)
rte = makeNode(RangeTblEntry);
rte->rtekind = RTE_RELATION;
- rte->relid = RelationGetRelid(rel->localrel);
- rte->relkind = rel->localrel->rd_rel->relkind;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
rte->rellockmode = AccessShareLock;
ExecInitRangeTable(estate, list_make1(rte));
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 9c997aed83..1faa6a224c 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -15,13 +15,23 @@
#include "access/tupconvert.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
+#include "catalog/pg_type.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
#include "replication/logical.h"
+#include "nodes/execnodes.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/planner.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
#include "utils/int8.h"
+#include "utils/builtins.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
@@ -98,6 +108,7 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *qual;
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -536,6 +547,65 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Assert(false);
}
+ /* ... then check row filter */
+ if (list_length(relentry->qual) > 0)
+ {
+ HeapTuple old_tuple;
+ HeapTuple new_tuple;
+ TupleDesc tupdesc;
+ EState *estate;
+ ExprContext *ecxt;
+ MemoryContext oldcxt;
+ ListCell *lc;
+ bool matched = true;
+
+ old_tuple = change->data.tp.oldtuple ? &change->data.tp.oldtuple->tuple : NULL;
+ new_tuple = change->data.tp.newtuple ? &change->data.tp.newtuple->tuple : NULL;
+ tupdesc = RelationGetDescr(relation);
+ estate = create_estate_for_relation(relation);
+
+ /* prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
+
+ ExecStoreHeapTuple(new_tuple ? new_tuple : old_tuple, ecxt->ecxt_scantuple, false);
+
+ foreach(lc, relentry->qual)
+ {
+ Node *qual;
+ ExprState *expr_state;
+ Expr *expr;
+ Oid expr_type;
+ Datum res;
+ bool isnull;
+
+ qual = (Node *) lfirst(lc);
+
+ /* evaluate the row filter */
+ expr_type = exprType(qual);
+ expr = (Expr *) coerce_to_target_type(NULL, qual, expr_type, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+ expr = expression_planner(expr);
+ expr_state = ExecInitExpr(expr, NULL);
+ res = ExecEvalExpr(expr_state, ecxt, &isnull);
+
+ /* if tuple does not match row filter, bail out */
+ if (!DatumGetBool(res) || isnull)
+ {
+ matched = false;
+ break;
+ }
+ }
+
+ MemoryContextSwitchTo(oldcxt);
+
+ ExecDropSingleTupleTableSlot(ecxt->ecxt_scantuple);
+ FreeExecutorState(estate);
+
+ if (!matched)
+ return;
+ }
+
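Note the tuple the filter sees: ExecStoreHeapTuple() above stores the new tuple when one exists, falling back to the old tuple, so INSERT and UPDATE are filtered on the new row image and DELETE on the old one. With a hypothetical filter WHERE (id > 2000 AND id <= 3000) on table t, the behavior would be:

```sql
INSERT INTO t VALUES (2500);  -- new tuple matches, change is sent
INSERT INTO t VALUES (100);   -- new tuple does not match, change is skipped
UPDATE t SET id = 5000
 WHERE id = 2500;             -- new tuple fails the filter, change is skipped
DELETE FROM t WHERE id = 2500; -- old tuple is evaluated for DELETE
```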
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
@@ -960,6 +1030,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->qual = NIL;
entry->publish_as_relid = InvalidOid;
}
@@ -990,6 +1061,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
foreach(lc, data->publications)
{
Publication *pub = lfirst(lc);
+ HeapTuple rf_tuple;
+ Datum rf_datum;
+ bool rf_isnull;
bool publish = false;
if (pub->alltables)
@@ -998,11 +1072,11 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
if (pub->pubviaroot && am_partition)
publish_as_relid = llast_oid(get_partition_ancestors(relid));
}
+ bool ancestor_published = false;
+ Oid ancestorOid = InvalidOid;
if (!publish)
{
- bool ancestor_published = false;
-
/*
* For a partition, check if any of the ancestors are
* published. If so, note down the topmost ancestor that is
@@ -1027,13 +1101,19 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
ancestor_published = true;
if (pub->pubviaroot)
+ {
publish_as_relid = ancestor;
+ }
+
+ ancestorOid = ancestor;
}
}
}
if (list_member_oid(pubids, pub->oid) || ancestor_published)
+ {
publish = true;
+ }
}
/*
@@ -1050,9 +1130,24 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /* Cache row filters, if available */
+ Oid relToUse = ancestor_published ? ancestorOid : relid;
+ rf_tuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(relToUse), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rf_tuple))
+ {
+ rf_datum = SysCacheGetAttr(PUBLICATIONRELMAP, rf_tuple, Anum_pg_publication_rel_prqual, &rf_isnull);
+
+ if (!rf_isnull)
+ {
+ MemoryContext oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ char *s = TextDatumGetCString(rf_datum);
+ Node *rf_node = stringToNode(s);
+
+ entry->qual = lappend(entry->qual, rf_node);
+ MemoryContextSwitchTo(oldctx);
+ }
+ ReleaseSysCache(rf_tuple);
+ }
}
list_free(pubids);
@@ -1173,5 +1268,10 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
*/
hash_seq_init(&status, RelationSyncCache);
while ((entry = (RelationSyncEntry *) hash_seq_search(&status)) != NULL)
+ {
entry->replicate_valid = false;
+ if (list_length(entry->qual) > 0)
+ list_free_deep(entry->qual);
+ entry->qual = NIL;
+ }
}
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 309d102d7d..3121d93d54 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,6 +85,12 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationQual
+{
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationQual;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -106,11 +112,12 @@ typedef enum PublicationPartOpt
} PublicationPartOpt;
extern List *GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt);
+extern List * GetPublicationRelationQuals(Oid pubid, PublicationPartOpt pub_partopt);
extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index 652cbcd6cb..47a5a9af43 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid; /* Oid of the publication */
Oid prrelid; /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -44,5 +48,6 @@ DECLARE_UNIQUE_INDEX(pg_publication_rel_oid_index, 6112, on pg_publication_rel u
#define PublicationRelObjectIndexId 6112
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
#define PublicationRelPrrelidPrpubidIndexId 6113
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
#endif /* PG_PUBLICATION_REL_H */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 3684f87a88..a336bf219d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -479,6 +479,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index ec14fc2036..06d5d872d6 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3498,12 +3498,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3516,7 +3523,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index d25819aa28..715701a4a7 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -78,6 +78,7 @@ typedef enum ParseExprKind
EXPR_KIND_CALL_ARGUMENT, /* procedure argument in CALL */
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 62ddd3c7a2..ce4455439d 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -49,4 +49,6 @@ extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
extern void logicalrep_typmap_update(LogicalRepTyp *remotetyp);
extern char *logicalrep_typmap_gettypname(Oid remoteid);
+extern EState *create_estate_for_relation(Relation rel);
+
#endif /* LOGICALRELATION_H */
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 63d6ab7a4e..1e69402adb 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -156,6 +156,35 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in WHERE
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3"
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: "testpub_view" is not a table
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075368..bad90fbf03 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,27 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+\dRp+ testpub5
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
--
2.19.0
0006-Subject-PATCH-6-8-Print-publication-WHERE-condition-.patchapplication/octet-stream; name=0006-Subject-PATCH-6-8-Print-publication-WHERE-condition-.patchDownload
From 09ca8468409fa01ef08331bfac231b2d24449847 Mon Sep 17 00:00:00 2001
From: Onder Kalaci <onderkalaci@gmail.com>
Date: Tue, 8 Dec 2020 16:46:56 +0300
Subject: [PATCH 6/7] Subject: [PATCH 6/8] Print publication WHERE condition in
psql
---
src/bin/psql/describe.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 14150d05a9..60ab245738 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5937,7 +5937,8 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
+ "SELECT n.nspname, c.relname,\n"
+ " pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
@@ -5967,6 +5968,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE %s",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
--
2.19.0
0007-Publication-where-condition-support-for-pg_dump.patchapplication/octet-stream; name=0007-Publication-where-condition-support-for-pg_dump.patchDownload
From 22c9af7a2ca768b6c69656579531900c0d291471 Mon Sep 17 00:00:00 2001
From: Onder Kalaci <onderkalaci@gmail.com>
Date: Tue, 8 Dec 2020 16:50:10 +0300
Subject: [PATCH 7/7] Publication where condition support for pg_dump
---
src/bin/pg_dump/pg_dump.c | 15 +++++++++++++--
src/bin/pg_dump/pg_dump.h | 1 +
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 3b36335aa6..d5ab04179b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4071,6 +4071,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_tableoid;
int i_oid;
int i_pubname;
+ int i_pubrelqual;
int i,
j,
ntups;
@@ -4106,7 +4107,8 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
/* Get the publication membership for the table. */
appendPQExpBuffer(query,
- "SELECT pr.tableoid, pr.oid, p.pubname "
+ "SELECT pr.tableoid, pr.oid, p.pubname, "
+ "pg_catalog.pg_get_expr(pr.prqual, pr.prrelid) AS pubrelqual "
"FROM pg_publication_rel pr, pg_publication p "
"WHERE pr.prrelid = '%u'"
" AND p.oid = pr.prpubid",
@@ -4127,6 +4129,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_pubname = PQfnumber(res, "pubname");
+ i_pubrelqual = PQfnumber(res, "pubrelqual");
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4142,6 +4145,11 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].pubname = pg_strdup(PQgetvalue(res, j, i_pubname));
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, j, i_pubrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, j, i_pubrelqual));
+
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
}
@@ -4170,8 +4178,11 @@ dumpPublicationTable(Archive *fout, PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubrinfo->pubname));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE %s", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 317bb83970..e9472d6986 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -616,6 +616,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
TableInfo *pubtable;
char *pubname;
+ char *pubrelqual;
} PublicationRelInfo;
/*
--
2.19.0
Hi Önder,
On Thu, Dec 17, 2020 at 3:43 PM Önder Kalacı <onderkalaci@gmail.com> wrote:
Hi all,
I'm also interested in this patch. I rebased the changes onto the current master branch and attached them. The rebase had two issues. First, patch 8 was conflicting, and it seems only helpful for debugging purposes during development, so I dropped it for simplicity. Second, the changes conflict with the `publish_via_partition_root` changes. I tried to fix the issues, but ended up with a limitation for now: a publication with a WHERE clause cannot be created on a partitioned table unless publish_via_partition_root is set to true. This restriction can be lifted, though I left that out for the sake of focusing on some issues that I observed in this patch.
Please see my review:
+	if (list_length(relentry->qual) > 0)
+	{
+		HeapTuple	old_tuple;
+		HeapTuple	new_tuple;
+		TupleDesc	tupdesc;
+		EState	   *estate;
+		ExprContext *ecxt;
+		MemoryContext oldcxt;
+		ListCell   *lc;
+		bool		matched = true;
+
+		old_tuple = change->data.tp.oldtuple ? &change->data.tp.oldtuple->tuple : NULL;
+		new_tuple = change->data.tp.newtuple ? &change->data.tp.newtuple->tuple : NULL;
+		tupdesc = RelationGetDescr(relation);
+		estate = create_estate_for_relation(relation);
+
+		/* prepare context per tuple */
+		ecxt = GetPerTupleExprContext(estate);
+		oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+		ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
+
+		ExecStoreHeapTuple(new_tuple ? new_tuple : old_tuple, ecxt->ecxt_scantuple, false);
+
+		foreach(lc, relentry->qual)
+		{
+			Node	   *qual;
+			ExprState  *expr_state;
+			Expr	   *expr;
+			Oid			expr_type;
+			Datum		res;
+			bool		isnull;
+
+			qual = (Node *) lfirst(lc);
+
+			/* evaluates row filter */
+			expr_type = exprType(qual);
+			expr = (Expr *) coerce_to_target_type(NULL, qual, expr_type, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+			expr = expression_planner(expr);
+			expr_state = ExecInitExpr(expr, NULL);
+			res = ExecEvalExpr(expr_state, ecxt, &isnull);
+
+			/* if tuple does not match row filter, bail out */
+			if (!DatumGetBool(res) || isnull)
+			{
+				matched = false;
+				break;
+			}
+		}
+
+		MemoryContextSwitchTo(oldcxt);

The above part can be considered the core of the logic, executed per tuple. As far as I can see, it has two downsides.
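The per-tuple semantics of the quoted code can be illustrated outside PostgreSQL: every qual must evaluate to true, and a NULL result counts as non-matching, mirroring SQL WHERE semantics. A rough Python analogy (the names and tree shape are mine, not from the patch):

```python
# Rough analogy of the row-filter loop above, outside PostgreSQL.
# A qual is a predicate that may return True, False, or None (SQL NULL).
# A row is published only if every qual returns True; None counts as
# non-matching, which mirrors the (!DatumGetBool(res) || isnull) check.

def row_matches(quals, row):
    for qual in quals:
        res = qual(row)
        if res is None or not res:  # NULL or false -> bail out
            return False
    return True

# Quals corresponding to WHERE (e > 1000 AND e < 2000)
quals = [
    lambda r: None if r["e"] is None else r["e"] > 1000,
    lambda r: None if r["e"] is None else r["e"] < 2000,
]

print(row_matches(quals, {"e": 1500}))  # True: both quals pass
print(row_matches(quals, {"e": 50}))    # False: first qual fails
print(row_matches(quals, {"e": None}))  # False: NULL is non-matching
```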
First, calling `expression_planner()` for every tuple can be quite expensive. I created a sample table, loaded data and ran a quick benchmark to see its effect. I attached the very simple script that I used to reproduce the issue on my laptop. I'm pretty sure you can find nicer ways of doing similar perf tests, just sharing as a reference.
The idea of the test is to add a WHERE clause to a table, but none of the tuples are filtered out. They just go through this code-path and send it to the remote node.
#rows   Patched         | Master
1M      00:00:25.067536 | 00:00:16.633988
10M     00:04:50.770791 | 00:02:40.945358

So, it seems a significant overhead to me. What do you think?
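The overhead pattern is easy to reproduce in miniature outside PostgreSQL: re-building a predicate for every row is much slower than building it once and re-evaluating. A hedged Python sketch, with `compile()` standing in for `expression_planner()` (an analogy only, not the actual code path):

```python
# Analogy for the per-tuple expression_planner() cost: compare compiling
# a predicate once versus re-compiling it for every row.
import time

rows = range(50_000)
src = "value > 100 and value < 1_000_000"

# Variant 1: "plan" (compile) the predicate once, evaluate per row --
# analogous to caching the planned expression in the relation cache entry.
t0 = time.perf_counter()
code = compile(src, "<filter>", "eval")
matched_once = sum(1 for value in rows if eval(code, {"value": value}))
t_once = time.perf_counter() - t0

# Variant 2: re-compile the predicate for every row -- analogous to
# calling expression_planner() per tuple, as in the quoted code.
t0 = time.perf_counter()
matched_per_row = sum(
    1 for value in rows
    if eval(compile(src, "<filter>", "eval"), {"value": value})
)
t_per_row = time.perf_counter() - t0

# Both variants select the same rows; only the cost differs.
print(f"compile once: {t_once:.3f}s, per row: {t_per_row:.3f}s")
```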
Secondly, and probably more importantly, allowing any operator is as dangerous as allowing any function, since users can create or overload operators. For example, assume that a user creates an operator which modifies the table that is being filtered:
```
CREATE OR REPLACE FUNCTION function_that_modifies_table(left_arg INTEGER, right_arg INTEGER)
RETURNS BOOL AS
$$
BEGIN
    INSERT INTO test SELECT * FROM test;
    RETURN left_arg > right_arg;
END;
$$ LANGUAGE PLPGSQL VOLATILE;

CREATE OPERATOR >>= (
    PROCEDURE = function_that_modifies_table,
    LEFTARG = INTEGER,
    RIGHTARG = INTEGER
);

CREATE PUBLICATION pub FOR TABLE test WHERE (key >>= 0);
```
With the above, we seem to be in trouble. Although it is an extreme example, it felt useful to share to illustrate the extent of the problem. We probably cannot allow any free-form SQL in the filters.
To overcome these issues, one approach could be to rely on known safe operators and functions. I believe the btree and hash operators should provide a pretty strong coverage across many use cases. As far as I can see, the procs that the following query returns can be our baseline:
```
select DISTINCT amproc.amproc::regproc AS opfamily_procedure
from pg_am am,
pg_opfamily opf,
pg_amproc amproc
where opf.opfmethod = am.oid
and amproc.amprocfamily = opf.oid
order by
opfamily_procedure;
```
With that, we aim to prevent users from easily shooting themselves in the foot.
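The allowlist idea can be sketched generically: walk the expression tree and reject any operator not in the approved set. A minimal Python sketch with an invented tree representation (not PostgreSQL's node format; the names here are mine):

```python
# Minimal sketch of validating a filter expression against an operator
# allowlist. The tree format is invented for illustration: a node is
# ("op", name, arg, ...) or a leaf ("col", name) / ("const", value).

ALLOWED_OPS = {"=", "<>", "<", "<=", ">", ">=", "and", "or", "not"}

def expression_is_safe(node):
    kind = node[0]
    if kind in ("col", "const"):
        return True
    if kind == "op":
        name, args = node[1], node[2:]
        return name in ALLOWED_OPS and all(expression_is_safe(a) for a in args)
    return False  # unknown node kinds are rejected outright

# key > 0 AND key < 100 -> only btree comparison operators, accepted
safe = ("op", "and",
        ("op", ">", ("col", "key"), ("const", 0)),
        ("op", "<", ("col", "key"), ("const", 100)))

# key >>= 0 with a user-defined operator -> rejected
unsafe = ("op", ">>=", ("col", "key"), ("const", 0))

print(expression_is_safe(safe))    # True
print(expression_is_safe(unsafe))  # False
```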
The other problematic area was performance, as calling `expression_planner()` for every tuple can be very expensive. To avoid that, one option is to ask users to provide a function instead of a free-form WHERE clause, such that the tuple is sent only if the function returns true. The allowed functions would need to be immutable SQL functions with a bool return type. As we can parse SQL functions, we should be able to allow only functions that rely on the above-mentioned procs. We can apply as many restrictions (such as no modifying queries) as possible. For example, see below:
```
CREATE OR REPLACE FUNCTION filter_tuples_for_test(int) RETURNS bool AS
$body$
    SELECT $1 > 100;
$body$
LANGUAGE sql IMMUTABLE;

CREATE PUBLICATION pub FOR TABLE test FILTER = filter_tuples_for_test(key);
```
In terms of performance, calling the function would avoid the `expression_planner()` call and should yield better results, though this needs to be verified.
If such an approach makes sense, I'd be happy to work on the patch. Please provide me feedback.
You sent in your patch to pgsql-hackers on Dec 17, but you did not
post it to the next CommitFest[1] (I found the old entry of this
patch[2] but it's marked as "Returned with feedback"). If this was
intentional, then you need to take no action. However, if you want
your patch to be reviewed as part of the upcoming CommitFest, then you
need to add it yourself before 2021-01-01 AoE[3]. Thanks for your
contributions.
Regards,
[1]: https://commitfest.postgresql.org/31/
[3]: https://en.wikipedia.org/wiki/Anywhere_on_Earth
--
Masahiko Sawada
EnterpriseDB: https://www.enterprisedb.com/
Hi Masahiko,
You sent in your patch to pgsql-hackers on Dec 17, but you did not
post it to the next CommitFest[1] (I found the old entry of this
patch[2] but it's marked as "Returned with feedback"). If this was
intentional, then you need to take no action. However, if you want
your patch to be reviewed as part of the upcoming CommitFest, then you
need to add it yourself before 2021-01-01 AoE[3]. Thanks for your
contributions.
Thanks for letting me know about this; I added this patch to the next commit
fest before 2021-01-01 AoE[3].
I'm also attaching the updated commits so that the tests pass on the CI.
Thanks,
Onder KALACI
Software Engineer at Microsoft &
Developing the Citus database extension for PostgreSQL
Attachments:
0001-Subject-PATCH-1-8-Remove-unused-atttypmod-column-fro.patchapplication/octet-stream; name=0001-Subject-PATCH-1-8-Remove-unused-atttypmod-column-fro.patchDownload
From a8eebea2a2ed9f019657fb09e0e7436c78c46003 Mon Sep 17 00:00:00 2001
From: Onder Kalaci <onderkalaci@gmail.com>
Date: Tue, 8 Dec 2020 16:42:37 +0300
Subject: [PATCH 1/7] Subject: [PATCH 1/8] Remove unused atttypmod column from
initial table synchronization
Since commit 7c4f52409a8c7d85ed169bbbc1f6092274d03920, atttypmod was
added but not used. The removal is safe because COPY from publisher
does not need such information.
---
src/backend/replication/logical/tablesync.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 1904f3471c..7357458db9 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -641,7 +641,7 @@ fetch_remote_table_info(char *nspname, char *relname,
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
- Oid attrRow[] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
+ Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
bool isnull;
int natt;
@@ -686,7 +686,6 @@ fetch_remote_table_info(char *nspname, char *relname,
appendStringInfo(&cmd,
"SELECT a.attname,"
" a.atttypid,"
- " a.atttypmod,"
" a.attnum = ANY(i.indkey)"
" FROM pg_catalog.pg_attribute a"
" LEFT JOIN pg_catalog.pg_index i"
@@ -719,7 +718,7 @@ fetch_remote_table_info(char *nspname, char *relname,
Assert(!isnull);
lrel->atttyps[natt] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
Assert(!isnull);
- if (DatumGetBool(slot_getattr(slot, 4, &isnull)))
+ if (DatumGetBool(slot_getattr(slot, 3, &isnull)))
lrel->attkeys = bms_add_member(lrel->attkeys, natt);
/* Should never happen. */
--
2.19.0
0004-Subject-PATCH-4-8-Rename-a-WHERE-node.patchapplication/octet-stream; name=0004-Subject-PATCH-4-8-Rename-a-WHERE-node.patchDownload
From e152642558d9b423f112d48870f4fcd6e3ec51c6 Mon Sep 17 00:00:00 2001
From: Onder Kalaci <onderkalaci@gmail.com>
Date: Tue, 8 Dec 2020 16:44:15 +0300
Subject: [PATCH 4/7] Subject: [PATCH 4/8] Rename a WHERE node
A WHERE clause will be used for row filtering in logical replication. We
already have a similar node: 'WHERE (condition here)'. Let's rename the
node to a generic name and use it for row filtering too.
---
src/backend/parser/gram.y | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index ecff4cd2ac..c0bb44a85c 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -484,7 +484,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <node> def_arg columnElem where_clause where_or_current_clause
a_expr b_expr c_expr AexprConst indirection_el opt_slice_bound
columnref in_expr having_clause func_table xmltable array_expr
- ExclusionWhereClause operator_def_arg
+ OptWhereClause operator_def_arg
%type <list> rowsfrom_item rowsfrom_list opt_col_def_list
%type <boolean> opt_ordinality
%type <list> ExclusionConstraintList ExclusionConstraintElem
@@ -3756,7 +3756,7 @@ ConstraintElem:
$$ = (Node *)n;
}
| EXCLUDE access_method_clause '(' ExclusionConstraintList ')'
- opt_c_include opt_definition OptConsTableSpace ExclusionWhereClause
+ opt_c_include opt_definition OptConsTableSpace OptWhereClause
ConstraintAttributeSpec
{
Constraint *n = makeNode(Constraint);
@@ -3858,7 +3858,7 @@ ExclusionConstraintElem: index_elem WITH any_operator
}
;
-ExclusionWhereClause:
+OptWhereClause:
WHERE '(' a_expr ')' { $$ = $3; }
| /*EMPTY*/ { $$ = NULL; }
;
--
2.19.0
0002-Subject-PATCH-2-8-Store-number-of-tuples-in-WalRcvEx.patchapplication/octet-stream; name=0002-Subject-PATCH-2-8-Store-number-of-tuples-in-WalRcvEx.patchDownload
From 31bcef5516beee095d5afe0fa0090183dbfdff9d Mon Sep 17 00:00:00 2001
From: Onder Kalaci <onderkalaci@gmail.com>
Date: Tue, 8 Dec 2020 16:43:23 +0300
Subject: [PATCH 2/7] Subject: [PATCH 2/8] Store number of tuples in
WalRcvExecResult
It seems to be a useful information while allocating memory for queries
that returns more than one row. It reduces memory allocation
for initial table synchronization.
---
src/backend/replication/libpqwalreceiver/libpqwalreceiver.c | 5 +++--
src/backend/replication/logical/tablesync.c | 5 ++---
src/include/replication/walreceiver.h | 1 +
3 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 24f8b3e42e..15a781fcc3 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -920,6 +920,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
errdetail("Expected %d fields, got %d fields.",
nRetTypes, nfields)));
+ walres->ntuples = PQntuples(pgres);
walres->tuplestore = tuplestore_begin_heap(true, false, work_mem);
/* Create tuple descriptor corresponding to expected result. */
@@ -930,7 +931,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
/* No point in doing more here if there were no tuples returned. */
- if (PQntuples(pgres) == 0)
+ if (walres->ntuples == 0)
return;
/* Create temporary context for local allocations. */
@@ -939,7 +940,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
ALLOCSET_DEFAULT_SIZES);
/* Process returned rows. */
- for (tupn = 0; tupn < PQntuples(pgres); tupn++)
+ for (tupn = 0; tupn < walres->ntuples; tupn++)
{
char *cstrs[MaxTupleAttributeNumber];
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 7357458db9..8b0d2b13ac 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -704,9 +704,8 @@ fetch_remote_table_info(char *nspname, char *relname,
(errmsg("could not fetch table info for table \"%s.%s\": %s",
nspname, relname, res->err)));
- /* We don't know the number of rows coming, so allocate enough space. */
- lrel->attnames = palloc0(MaxTupleAttributeNumber * sizeof(char *));
- lrel->atttyps = palloc0(MaxTupleAttributeNumber * sizeof(Oid));
+ lrel->attnames = palloc0(res->ntuples * sizeof(char *));
+ lrel->atttyps = palloc0(res->ntuples * sizeof(Oid));
lrel->attkeys = NULL;
natt = 0;
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index 1b05b39df4..ac0d7bf730 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -213,6 +213,7 @@ typedef struct WalRcvExecResult
char *err;
Tuplestorestate *tuplestore;
TupleDesc tupledesc;
+ int ntuples;
} WalRcvExecResult;
/* WAL receiver - libpqwalreceiver hooks */
--
2.19.0
0003-Subject-PATCH-3-8-Refactor-function-create_estate_fo.patchapplication/octet-stream; name=0003-Subject-PATCH-3-8-Refactor-function-create_estate_fo.patchDownload
From f93e0702377b2cd74201de25ea7516f335c10200 Mon Sep 17 00:00:00 2001
From: Onder Kalaci <onderkalaci@gmail.com>
Date: Tue, 8 Dec 2020 16:43:51 +0300
Subject: [PATCH 3/7] Subject: [PATCH 3/8] Refactor function
create_estate_for_relation
Relation localrel is the only LogicalRepRelMapEntry structure member
that is useful for create_estate_for_relation.
---
src/backend/replication/logical/worker.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 8c7fad8f74..e742eceb71 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -341,7 +341,7 @@ handle_streamed_transaction(LogicalRepMsgType action, StringInfo s)
* This is based on similar code in copy.c
*/
static EState *
-create_estate_for_relation(LogicalRepRelMapEntry *rel)
+create_estate_for_relation(Relation rel)
{
EState *estate;
RangeTblEntry *rte;
@@ -1176,7 +1176,7 @@ apply_handle_insert(StringInfo s)
}
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -1301,7 +1301,7 @@ apply_handle_update(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -1457,7 +1457,7 @@ apply_handle_delete(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
--
2.19.0
0005-Subject-PATCH-5-8-Row-filtering-for-logical-replicat.patchapplication/octet-stream; name=0005-Subject-PATCH-5-8-Row-filtering-for-logical-replicat.patchDownload
From 8ceba9962853637637f74b173253a28c6e9db658 Mon Sep 17 00:00:00 2001
From: Onder Kalaci <onderkalaci@gmail.com>
Date: Tue, 8 Dec 2020 16:44:50 +0300
Subject: [PATCH 5/7] Subject: [PATCH 5/8] Row filtering for logical
replication
When you define or modify a publication you optionally can filter rows
to be published using a WHERE condition. This condition is any
expression that evaluates to boolean. Only those rows that
satisfy the WHERE condition will be sent to subscribers.
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 21 +-
src/backend/catalog/pg_publication.c | 207 +++++++++++++++++---
src/backend/commands/publicationcmds.c | 95 ++++++---
src/backend/parser/gram.y | 25 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 13 ++
src/backend/parser/parse_func.c | 3 +
src/backend/replication/logical/tablesync.c | 122 ++++++++++--
src/backend/replication/logical/worker.c | 6 +-
src/backend/replication/pgoutput/pgoutput.c | 110 ++++++++++-
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 5 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/include/replication/logicalrelation.h | 2 +
src/test/regress/expected/publication.out | 29 +++
src/test/regress/sql/publication.sql | 21 ++
20 files changed, 626 insertions(+), 85 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 79069ddfab..b48f97d82e 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -5609,6 +5609,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
The expression tree to be added to the WITH CHECK qualifications for queries that attempt to add rows to the table
</para></entry>
</row>
+
+ <row>
+ <entry><structfield>prqual</structfield></entry>
+ <entry><type>pg_node_tree</type></entry>
+ <entry></entry>
+ <entry>Expression tree (in the form of a
+ <function>nodeToString()</function> representation) for the relation's
+ qualifying condition</entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index c2946dfe0f..ae4da00711 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -91,7 +91,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is executed with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbca55..4b015b37f3 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -182,6 +182,13 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
disallowed on those tables.
</para>
+ <para>
+ Columns used in the <literal>WHERE</literal> clause must be part of the
+ primary key or be covered by <literal>REPLICA IDENTITY</literal> otherwise
+ <command>UPDATE</command> and <command>DELETE</command> operations will not
+ be replicated.
+ </para>
+
<para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
@@ -197,6 +204,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -209,6 +221,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</programlisting>
</para>
+ <para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
<para>
Create a publication that publishes all changes in all tables:
<programlisting>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 09946be788..3ef427e16c 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,11 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
+
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -42,6 +47,9 @@
#include "utils/rel.h"
#include "utils/syscache.h"
+static List * PublicationPartitionedRelationGetRelations(Oid relationId,
+ PublicationPartOpt pub_partopt);
+
/*
* Check if relation can be in given publication and throws appropriate
* error if not.
@@ -141,18 +149,22 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
+ Oid relid = RelationGetRelid(targetrel->relation);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ RangeTblEntry *rte;
+ Node *whereclause;
+ ParseNamespaceItem *pitem;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -172,10 +184,41 @@ publication_add_relation(Oid pubid, Relation targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel), pub->name)));
+ RelationGetRelationName(targetrel->relation), pub->name)));
+ }
+
+ check_publication_add_relation(targetrel->relation);
+
+ if (get_rel_relkind(relid) == RELKIND_PARTITIONED_TABLE && !pub->pubviaroot &&
+ targetrel->whereClause)
+ {
+ table_close(rel, RowExclusiveLock);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("cannot create publication \"%s\" with WHERE clause on partitioned table "
+ "\"%s\" without publish_via_partition_root is true", pub->name,
+ RelationGetRelationName(targetrel->relation))));
}
- check_publication_add_relation(targetrel);
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(targetrel->whereClause);
+
+ pitem = addRangeTableEntryForRelation(pstate, targetrel->relation,
+ AccessShareLock,
+ NULL, false, false);
+ rte = pitem->p_rte;
+
+ addNSItemToQuery(pstate, pitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(targetrel->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -189,6 +232,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -205,11 +254,17 @@ publication_add_relation(Oid pubid, Relation targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel);
+ CacheInvalidateRelcache(targetrel->relation);
return myself;
}
@@ -271,31 +326,136 @@ GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt)
pubrel = (Form_pg_publication_rel) GETSTRUCT(tup);
- if (get_rel_relkind(pubrel->prrelid) == RELKIND_PARTITIONED_TABLE &&
- pub_partopt != PUBLICATION_PART_ROOT)
+ if (get_rel_relkind(pubrel->prrelid) != RELKIND_PARTITIONED_TABLE)
+ result = lappend_oid(result, pubrel->prrelid);
+ else
{
- List *all_parts = find_all_inheritors(pubrel->prrelid, NoLock,
- NULL);
+ List *all_parts = PublicationPartitionedRelationGetRelations(pubrel->prrelid, pub_partopt);
- if (pub_partopt == PUBLICATION_PART_ALL)
- result = list_concat(result, all_parts);
- else if (pub_partopt == PUBLICATION_PART_LEAF)
- {
- ListCell *lc;
+ result = list_concat(result, all_parts);
+ }
+ }
- foreach(lc, all_parts)
- {
- Oid partOid = lfirst_oid(lc);
+ systable_endscan(scan);
+ table_close(pubrelsrel, AccessShareLock);
+
+ return result;
+}
+
+
+/*
+ * For the given partitionedRelationId and pub_partopt, return the list of
+ * relations that should be used for the publication.
+ */
+static List *
+PublicationPartitionedRelationGetRelations(Oid partitionedRelationId,
+ PublicationPartOpt pub_partopt)
+{
+ List *result = NIL;
+ List *all_parts = NIL;
+
+ AssertArg(get_rel_relkind(partitionedRelationId) == RELKIND_PARTITIONED_TABLE);
+
+ if (pub_partopt == PUBLICATION_PART_ROOT)
+ return list_make1_oid(partitionedRelationId);
- if (get_rel_relkind(partOid) != RELKIND_PARTITIONED_TABLE)
- result = lappend_oid(result, partOid);
- }
+ all_parts = find_all_inheritors(partitionedRelationId, NoLock, NULL);
+ if (pub_partopt == PUBLICATION_PART_ALL)
+ result = list_concat(result, all_parts);
+ else if (pub_partopt == PUBLICATION_PART_LEAF)
+ {
+ ListCell *lc;
+
+ foreach(lc, all_parts)
+ {
+ Oid partOid = lfirst_oid(lc);
+
+ if (get_rel_relkind(partOid) != RELKIND_PARTITIONED_TABLE)
+ {
+ result = lappend_oid(result, partOid);
}
- else
- Assert(false);
+ }
+ }
+
+ return result;
+}
+
+
+/*
+ * Gets list of PublicationRelationQuals for a publication.
+ *
+ * This should only be used for normal publications; FOR ALL TABLES
+ * publications cannot have a WHERE clause, so this function should not
+ * be called for them.
+List *
+GetPublicationRelationQuals(Oid pubid, PublicationPartOpt pub_partopt)
+{
+ List *result;
+ Relation pubrelsrel;
+ ScanKeyData scankey;
+ SysScanDesc scan;
+ HeapTuple tup;
+
+ /* Find all publications associated with the relation. */
+ pubrelsrel = table_open(PublicationRelRelationId, AccessShareLock);
+
+ ScanKeyInit(&scankey,
+ Anum_pg_publication_rel_prpubid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(pubid));
+
+ scan = systable_beginscan(pubrelsrel, PublicationRelPrrelidPrpubidIndexId,
+ true, NULL, 1, &scankey);
+
+ result = NIL;
+ while (HeapTupleIsValid(tup = systable_getnext(scan)))
+ {
+ Form_pg_publication_rel pubrel;
+ Datum value_datum;
+ char *qual_value;
+ Node *qual_expr;
+ bool isnull;
+
+ pubrel = (Form_pg_publication_rel) GETSTRUCT(tup);
+
+ value_datum = heap_getattr(tup, Anum_pg_publication_rel_prqual, RelationGetDescr(pubrelsrel), &isnull);
+ if (!isnull)
+ {
+ qual_value = TextDatumGetCString(value_datum);
+ qual_expr = (Node *) stringToNode(qual_value);
}
else
- result = lappend_oid(result, pubrel->prrelid);
+ qual_expr = NULL;
+
+ if (get_rel_relkind(pubrel->prrelid) != RELKIND_PARTITIONED_TABLE)
+ {
+ PublicationRelationQual *relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = table_open(pubrel->prrelid, ShareUpdateExclusiveLock);
+ relqual->whereClause = copyObject(qual_expr);
+
+ result = lappend(result, relqual);
+ }
+ else
+ {
+ List *all_parts =
+ PublicationPartitionedRelationGetRelations(pubrel->prrelid, pub_partopt);
+ ListCell *lc;
+
+ foreach(lc, all_parts)
+ {
+ Oid partOid = lfirst_oid(lc);
+
+ PublicationRelationQual *relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = table_open(partOid, NoLock);
+
+ /* for all partitions, use the same qual */
+ relqual->whereClause = copyObject(qual_expr);
+ result = lappend(result, relqual);
+ }
+ }
}
systable_endscan(scan);
@@ -304,6 +464,7 @@ GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt)
return result;
}
+
/*
* Gets list of publication oids for publications marked as FOR ALL TABLES.
*/
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index eabbc7473b..ffc1d14ec7 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -372,6 +372,28 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ /*
+ * ALTER PUBLICATION ... DROP TABLE cannot contain a WHERE clause. We
+ * reuse the publication_table_list node (which accepts a WHERE clause)
+ * but forbid the WHERE clause here; a separate relation_expr_list node
+ * just for the DROP TABLE part is not worth the trouble.
+ */
+ if (stmt->tableAction == DEFELEM_DROP)
+ {
+ ListCell *lc;
+
+ foreach(lc, stmt->tables)
+ {
+ PublicationTable *t = lfirst(lc);
+
+ if (t->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot use a WHERE clause for removing table from publication \"%s\"",
+ NameStr(pubform->pubname))));
+ }
+ }
+
rels = OpenTableList(stmt->tables);
if (stmt->tableAction == DEFELEM_ADD)
@@ -380,48 +402,59 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
PublicationDropTables(pubid, rels, false);
else /* DEFELEM_SET */
{
- List *oldrelids = GetPublicationRelations(pubid,
- PUBLICATION_PART_ROOT);
+ List *oldrelquals = GetPublicationRelationQuals(pubid,
+ PUBLICATION_PART_ROOT);
List *delrels = NIL;
- ListCell *oldlc;
+ ListCell *oldrelqualc;
/* Calculate which relations to drop. */
- foreach(oldlc, oldrelids)
+ foreach(oldrelqualc, oldrelquals)
{
- Oid oldrelid = lfirst_oid(oldlc);
+ PublicationRelationQual *oldrelqual = lfirst(oldrelqualc);
+ PublicationRelationQual *newrelqual;
ListCell *newlc;
bool found = false;
foreach(newlc, rels)
{
- Relation newrel = (Relation) lfirst(newlc);
+ newrelqual = (PublicationRelationQual *) lfirst(newlc);
- if (RelationGetRelid(newrel) == oldrelid)
+ if (RelationGetRelid(newrelqual->relation) == RelationGetRelid(oldrelqual->relation))
{
found = true;
break;
}
}
- if (!found)
+
+ /*
+ * Remove the publication / relation mapping if (i) the table is not
+ * found in the new list, or (ii) the table is found but its qual does
+ * not match the old one (in that case a simple tuple update is not
+ * enough because of the dependencies).
+ */
+ if (!found || !equal(oldrelqual->whereClause, newrelqual->whereClause))
{
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
+ PublicationRelationQual *oldrelqual2 = palloc(sizeof(PublicationRelationQual));
- delrels = lappend(delrels, oldrel);
+ oldrelqual2->relation = table_open(RelationGetRelid(oldrelqual->relation),
+ ShareUpdateExclusiveLock);
+
+ delrels = lappend(delrels, oldrelqual2);
}
}
/* And drop them. */
PublicationDropTables(pubid, delrels, true);
+ CloseTableList(oldrelquals);
+ CloseTableList(delrels);
+
/*
* Don't bother calculating the difference for adding, we'll catch and
* skip existing ones when doing catalog update.
*/
PublicationAddTables(pubid, rels, true, stmt);
-
- CloseTableList(delrels);
}
CloseTableList(rels);
@@ -509,13 +542,15 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationQual *relqual;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
+ PublicationTable *t = lfirst(lc);
+ RangeVar *rv = castNode(RangeVar, t->relation);
bool recurse = rv->inh;
Relation rel;
Oid myrelid;
@@ -538,8 +573,10 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = rel;
+ relqual->whereClause = t->whereClause;
+ rels = lappend(rels, relqual);
relids = lappend_oid(relids, myrelid);
/*
@@ -572,7 +609,11 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ relqual = palloc(sizeof(PublicationRelationQual));
+ relqual->relation = rel;
+ /* child inherits WHERE clause from parent */
+ relqual->whereClause = t->whereClause;
+ rels = lappend(rels, relqual);
relids = lappend_oid(relids, childrelid);
}
}
@@ -593,10 +634,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -612,13 +655,13 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(RelationGetRelid(rel->relation), GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->relation->rd_rel->relkind),
+ RelationGetRelationName(rel->relation));
obj = publication_add_relation(pubid, rel, if_not_exists);
if (stmt)
@@ -644,8 +687,8 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationQual *rel = (PublicationRelationQual *) lfirst(lc);
+ Oid relid = RelationGetRelid(rel->relation);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
ObjectIdGetDatum(relid),
@@ -658,7 +701,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(rel->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c0bb44a85c..af7cec58e7 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -414,13 +414,14 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
relation_expr_list dostmt_opt_list
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
+ publication_table_list
vacuum_relation_list opt_vacuum_relation_list
drop_option_list
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -9446,7 +9447,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9477,7 +9478,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9485,7 +9486,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9493,7 +9494,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE relation_expr_list
+ | ALTER PUBLICATION name DROP TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9503,6 +9504,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 783f3fe8f2..722272f2ba 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -544,6 +544,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -933,6 +940,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_GENERATED_COLUMN:
err = _("window functions are not allowed in column generation expressions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 36002f059d..c5bc464806 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -169,6 +169,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -567,6 +574,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_CALL_ARGUMENT:
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1889,6 +1897,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3488,6 +3499,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "WHERE";
case EXPR_KIND_GENERATED_COLUMN:
return "GENERATED AS";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 23ac2a2fe6..a793c3bf79 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2527,6 +2527,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_GENERATED_COLUMN:
err = _("set-returning functions are not allowed in column generation expressions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 8b0d2b13ac..a43a7d011f 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -631,19 +631,26 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[1] = {TEXTOID};
bool isnull;
- int natt;
+ int n;
+ ListCell *lc;
+ bool first;
+
+ /* Avoid trashing relation map cache */
+ memset(lrel, 0, sizeof(LogicalRepRelation));
lrel->nspname = nspname;
lrel->relname = relname;
@@ -708,20 +715,20 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->atttyps = palloc0(res->ntuples * sizeof(Oid));
lrel->attkeys = NULL;
- natt = 0;
+ n = 0;
slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
{
- lrel->attnames[natt] =
+ lrel->attnames[n] =
TextDatumGetCString(slot_getattr(slot, 1, &isnull));
Assert(!isnull);
- lrel->atttyps[natt] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
+ lrel->atttyps[n] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
Assert(!isnull);
if (DatumGetBool(slot_getattr(slot, 3, &isnull)))
- lrel->attkeys = bms_add_member(lrel->attkeys, natt);
+ lrel->attkeys = bms_add_member(lrel->attkeys, n);
/* Should never happen. */
- if (++natt >= MaxTupleAttributeNumber)
+ if (++n >= MaxTupleAttributeNumber)
elog(ERROR, "too many columns in remote table \"%s.%s\"",
nspname, relname);
@@ -729,12 +736,85 @@ fetch_remote_table_info(char *nspname, char *relname,
}
ExecDropSingleTupleTableSlot(slot);
- lrel->natts = natt;
+ lrel->natts = n;
+
+ walrcv_clear_result(res);
+
+ /* Get relation qual */
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(wrconn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
walrcv_clear_result(res);
pfree(cmd.data);
}
+static char *
+TableQualToText(List *qual)
+{
+ StringInfoData cmd;
+ ListCell *lc;
+ bool first = true;
+
+ if (qual == NIL)
+ return "true";
+
+ initStringInfo(&cmd);
+
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfoString(&cmd, q);
+ }
+
+ return cmd.data;
+}
+
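The helper above can be mirrored in Python for review (illustrative only, not part of the patch): an empty qual list yields the constant "true" so the generated COPY query copies every row; otherwise the per-publication quals are joined with OR, exactly as TableQualToText does:

```python
def table_qual_to_text(quals):
    # Mirrors TableQualToText: no quals means the COPY query should
    # select every row, hence the constant "true".
    if not quals:
        return "true"
    # AND binds tighter than OR in SQL, so joining the clause texts
    # with " OR " preserves each qual's meaning, as in the C code.
    return " OR ".join(quals)
```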
/*
* Copy existing data of a table from publisher.
*
@@ -745,6 +825,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -753,7 +834,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -762,16 +843,20 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* list of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
- * For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * For non-tables or tables with quals, we need to do
+ * COPY (SELECT ...), but we can't just do SELECT * because
+ * we need to not copy generated columns.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -780,9 +865,14 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
- quote_qualified_identifier(lrel.nspname, lrel.relname));
+ appendStringInfo(&cmd, " FROM %s WHERE %s) TO STDOUT",
+ quote_qualified_identifier(lrel.nspname, lrel.relname),
+ TableQualToText(qual));
}
+
+ /* we don't need quals anymore */
+ list_free_deep(qual);
+
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
@@ -797,7 +887,6 @@ copy_table(Relation rel)
(void) addRangeTableEntryForRelation(pstate, rel, AccessShareLock,
NULL, false, false);
- attnamelist = make_copy_attnamelist(relmapentry);
cstate = BeginCopyFrom(pstate, rel, NULL, NULL, false, copy_read_data, attnamelist, NIL);
/* Do the copy */
@@ -806,6 +895,7 @@ copy_table(Relation rel)
logicalrep_rel_close(relmapentry, NoLock);
}
+
/*
* Start syncing the table in the sync worker.
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index e742eceb71..29db29e7ba 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -340,7 +340,7 @@ handle_streamed_transaction(LogicalRepMsgType action, StringInfo s)
*
* This is based on similar code in copy.c
*/
-static EState *
+EState *
create_estate_for_relation(Relation rel)
{
EState *estate;
@@ -350,8 +350,8 @@ create_estate_for_relation(Relation rel)
rte = makeNode(RangeTblEntry);
rte->rtekind = RTE_RELATION;
- rte->relid = RelationGetRelid(rel->localrel);
- rte->relkind = rel->localrel->rd_rel->relkind;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
rte->rellockmode = AccessShareLock;
ExecInitRangeTable(estate, list_make1(rte));
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 9c997aed83..1faa6a224c 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -15,13 +15,23 @@
#include "access/tupconvert.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
+#include "catalog/pg_type.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
#include "replication/logical.h"
+#include "nodes/execnodes.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/planner.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
#include "utils/int8.h"
+#include "utils/builtins.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
@@ -98,6 +108,7 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *qual;
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -536,6 +547,65 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
Assert(false);
}
+ /* ... then check row filter */
+ if (list_length(relentry->qual) > 0)
+ {
+ HeapTuple old_tuple;
+ HeapTuple new_tuple;
+ TupleDesc tupdesc;
+ EState *estate;
+ ExprContext *ecxt;
+ MemoryContext oldcxt;
+ ListCell *lc;
+ bool matched = true;
+
+ old_tuple = change->data.tp.oldtuple ? &change->data.tp.oldtuple->tuple : NULL;
+ new_tuple = change->data.tp.newtuple ? &change->data.tp.newtuple->tuple : NULL;
+ tupdesc = RelationGetDescr(relation);
+ estate = create_estate_for_relation(relation);
+
+ /* prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
+
+ ExecStoreHeapTuple(new_tuple ? new_tuple : old_tuple, ecxt->ecxt_scantuple, false);
+
+ foreach(lc, relentry->qual)
+ {
+ Node *qual;
+ ExprState *expr_state;
+ Expr *expr;
+ Oid expr_type;
+ Datum res;
+ bool isnull;
+
+ qual = (Node *) lfirst(lc);
+
+ /* evaluates row filter */
+ expr_type = exprType(qual);
+ expr = (Expr *) coerce_to_target_type(NULL, qual, expr_type, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+ expr = expression_planner(expr);
+ expr_state = ExecInitExpr(expr, NULL);
+ res = ExecEvalExpr(expr_state, ecxt, &isnull);
+
+ /* if tuple does not match row filter, bail out */
+ if (!DatumGetBool(res) || isnull)
+ {
+ matched = false;
+ break;
+ }
+ }
+
+ MemoryContextSwitchTo(oldcxt);
+
+ ExecDropSingleTupleTableSlot(ecxt->ecxt_scantuple);
+ FreeExecutorState(estate);
+
+ if (!matched)
+ return;
+ }
+
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
@@ -960,6 +1030,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->qual = NIL;
entry->publish_as_relid = InvalidOid;
}
@@ -990,6 +1061,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
foreach(lc, data->publications)
{
Publication *pub = lfirst(lc);
+ HeapTuple rf_tuple;
+ Datum rf_datum;
+ bool rf_isnull;
+ bool ancestor_published = false;
+ Oid ancestorOid = InvalidOid;
bool publish = false;
if (pub->alltables)
@@ -998,11 +1072,11 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
if (pub->pubviaroot && am_partition)
publish_as_relid = llast_oid(get_partition_ancestors(relid));
}
if (!publish)
{
- bool ancestor_published = false;
-
/*
* For a partition, check if any of the ancestors are
* published. If so, note down the topmost ancestor that is
@@ -1027,13 +1101,19 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
ancestor_published = true;
if (pub->pubviaroot)
+ {
publish_as_relid = ancestor;
+ }
+
+ ancestorOid = ancestor;
}
}
}
if (list_member_oid(pubids, pub->oid) || ancestor_published)
+ {
publish = true;
+ }
}
/*
@@ -1050,9 +1130,24 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /* Cache row filters, if available */
+ Oid relToUse = ancestor_published ? ancestorOid : relid;
+ rf_tuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(relToUse), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rf_tuple))
+ {
+ rf_datum = SysCacheGetAttr(PUBLICATIONRELMAP, rf_tuple, Anum_pg_publication_rel_prqual, &rf_isnull);
+
+ if (!rf_isnull)
+ {
+ MemoryContext oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ char *s = TextDatumGetCString(rf_datum);
+ Node *rf_node = stringToNode(s);
+
+ entry->qual = lappend(entry->qual, rf_node);
+ MemoryContextSwitchTo(oldctx);
+ }
+ ReleaseSysCache(rf_tuple);
+ }
}
list_free(pubids);
@@ -1173,5 +1268,10 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
*/
hash_seq_init(&status, RelationSyncCache);
while ((entry = (RelationSyncEntry *) hash_seq_search(&status)) != NULL)
+ {
entry->replicate_valid = false;
+ if (list_length(entry->qual) > 0)
+ list_free_deep(entry->qual);
+ entry->qual = NIL;
+ }
}
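Worth noting for review: pgoutput_change above publishes a row only when every cached qual matches (it bails out on the first qual that is false or NULL), i.e. AND semantics across publications, whereas the tablesync COPY path ORs the quals together. A Python sketch of the decoding-path check, with plain callables standing in for the planned expressions (illustrative only):

```python
def row_matches(quals, row):
    # Mirrors the pgoutput loop: bail out on the first qual that
    # evaluates to false or NULL (modeled here as None).
    for qual in quals:
        result = qual(row)
        if result is None or not result:
            return False
    # No quals, or all quals matched: the change is published.
    return True
```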
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 309d102d7d..3121d93d54 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,6 +85,12 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationQual
+{
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationQual;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -106,11 +112,12 @@ typedef enum PublicationPartOpt
} PublicationPartOpt;
extern List *GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt);
+extern List *GetPublicationRelationQuals(Oid pubid, PublicationPartOpt pub_partopt);
extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index 652cbcd6cb..47a5a9af43 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid; /* Oid of the publication */
Oid prrelid; /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -44,5 +48,6 @@ DECLARE_UNIQUE_INDEX(pg_publication_rel_oid_index, 6112, on pg_publication_rel u
#define PublicationRelObjectIndexId 6112
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
#define PublicationRelPrrelidPrpubidIndexId 6113
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
#endif /* PG_PUBLICATION_REL_H */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 3684f87a88..a336bf219d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -479,6 +479,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index ec14fc2036..06d5d872d6 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3498,12 +3498,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3516,7 +3523,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index d25819aa28..715701a4a7 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -78,6 +78,7 @@ typedef enum ParseExprKind
EXPR_KIND_CALL_ARGUMENT, /* procedure argument in CALL */
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 62ddd3c7a2..ce4455439d 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -49,4 +49,6 @@ extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
extern void logicalrep_typmap_update(LogicalRepTyp *remotetyp);
extern char *logicalrep_typmap_gettypname(Oid remoteid);
+extern EState *create_estate_for_relation(Relation rel);
+
#endif /* LOGICALRELATION_H */
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 63d6ab7a4e..f2fc6b7ff9 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -156,6 +156,35 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in WHERE
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: "testpub_view" is not a table
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075368..bad90fbf03 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,27 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+\dRp+ testpub5
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
--
2.19.0
0006-Subject-PATCH-6-8-Print-publication-WHERE-condition-.patch (application/octet-stream)
From 1371a7d12083dcb6058039f1466c4c6144062ab7 Mon Sep 17 00:00:00 2001
From: Onder Kalaci <onderkalaci@gmail.com>
Date: Tue, 8 Dec 2020 16:46:56 +0300
Subject: [PATCH 6/7] Subject: [PATCH 6/8] Print publication WHERE condition in
psql
---
src/bin/psql/describe.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 14150d05a9..60ab245738 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5937,7 +5937,8 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
+ "SELECT n.nspname, c.relname,\n"
+ " pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
@@ -5967,6 +5968,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE %s",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
--
2.19.0
0007-Publication-where-condition-support-for-pg_dump.patch (application/octet-stream)
From ef02b722c7630db8725cbe493eb3ec08f96b5642 Mon Sep 17 00:00:00 2001
From: Onder Kalaci <onderkalaci@gmail.com>
Date: Tue, 8 Dec 2020 16:50:10 +0300
Subject: [PATCH 7/7] Publication where condition support for pg_dump
---
src/bin/pg_dump/pg_dump.c | 15 +++++++++++++--
src/bin/pg_dump/pg_dump.h | 1 +
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 3b36335aa6..d5ab04179b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4071,6 +4071,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_tableoid;
int i_oid;
int i_pubname;
+ int i_pubrelqual;
int i,
j,
ntups;
@@ -4106,7 +4107,8 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
/* Get the publication membership for the table. */
appendPQExpBuffer(query,
- "SELECT pr.tableoid, pr.oid, p.pubname "
+ "SELECT pr.tableoid, pr.oid, p.pubname, "
+ "pg_catalog.pg_get_expr(pr.prqual, pr.prrelid) AS pubrelqual "
"FROM pg_publication_rel pr, pg_publication p "
"WHERE pr.prrelid = '%u'"
" AND p.oid = pr.prpubid",
@@ -4127,6 +4129,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_pubname = PQfnumber(res, "pubname");
+ i_pubrelqual = PQfnumber(res, "pubrelqual");
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4142,6 +4145,11 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].pubname = pg_strdup(PQgetvalue(res, j, i_pubname));
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, j, i_pubrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, j, i_pubrelqual));
+
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
}
@@ -4170,8 +4178,11 @@ dumpPublicationTable(Archive *fout, PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubrinfo->pubname));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE %s", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 317bb83970..e9472d6986 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -616,6 +616,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
TableInfo *pubtable;
char *pubname;
+ char *pubrelqual;
} PublicationRelInfo;
/*
--
2.19.0
Hi,
On 2020-12-17 09:43:30 +0300, Önder Kalacı wrote:
The above part can be considered the core of the logic, executed per tuple.
As far as I can see, it has two downsides.First, calling `expression_planner()` for every tuple can be quite
expensive. I created a sample table, loaded data and ran a quick benchmark
to see its effect. I attached the very simple script that I used to
reproduce the issue on my laptop. I'm pretty sure you can find nicer ways
of doing similar perf tests, just sharing as a reference.The idea of the test is to add a WHERE clause to a table, but none of the
tuples are filtered out. They just go through this code-path and send it to
the remote node.#rows Patched | Master
1M 00:00:25.067536 | 00:00:16.633988
10M 00:04:50.770791 | 00:02:40.945358So, it seems a significant overhead to me. What do you think?
That seems almost prohibitively expensive. I think at the very least
some of this work would need to be done in a cached manner, e.g. via
get_rel_sync_entry().
Secondly, probably more importantly, allowing any operator is as dangerous
as allowing any function as users can create/overload operator(s).
That's not safe, indeed. It's not even just creating/overloading
operators; as far as I can tell the expression can contain plain
function calls.

The issue also isn't primarily that the user can overload functions,
it's that logical decoding is a limited environment, and not everything
is safe to do within it. E.g., only catalog tables can be
accessed. Therefore I don't think we can allow arbitrary expressions.
The other problematic area was performance, as calling
`expression_planner()` for every tuple can be very expensive. To avoid
that, we could ask users to provide a function instead of a free-form
WHERE clause, such that the tuple is sent if the function returns true. The
allowed functions would need to be immutable SQL functions with a bool
return type. As we can parse SQL functions, we should be able to allow
only functions that rely on the above-mentioned procs. We can apply as many
restrictions (such as no modification queries) as possible. For example, see
below:
I don't think that would get us very far.
From a safety aspect: a function's body can be changed by the user at
any time, therefore we cannot rely on analysis of the function's body.
From a performance POV: SQL functions are planned at every invocation,
so that wouldn't buy us much either.
I think what you would have to do instead is to ensure that the
expression is "simple enough", and then process it into a cheaply
executable format in get_rel_sync_entry(). I'd suggest that in the first
version you just allow a simple ANDed list of 'foo.bar op constant'
expressions.
Does that make sense?
Greetings,
Andres Freund
On Mon, Mar 16, 2020, at 10:58 AM, David Steele wrote:
Please submit to a future CF when a new patch is available.
Hi,
This is another version of the row filter patch. Patch summary:
0001: refactor to remove dead code
0002: grammar refactor for row filter
0003: core code, documentation, and tests
0004: psql code
0005: pg_dump support
0006: debug messages (only for test purposes)
0007: measure row filter overhead (only for test purposes)
From the previous version I incorporated Amit's suggestions [1], improved documentation and tests. I refactored the code to make it simpler to read (broke the row filter code into functions). This new version covers the new parameter publish_via_partition_root that was introduced (cf. 83fd4532a7).
Regarding function prohibition, I wouldn't like to open a can of worms (see previous discussions in this thread). Simple expressions cover most of the use cases I have worked with so far. This prohibition can be removed in another patch after some careful analysis.
I did some limited tests and didn't observe excessive CPU usage while testing this patch, though I agree with Andres that retaining some expression context in a cache would certainly speed up this piece of code. I measured the row filter overhead on my i7 (see 0007) and got:
mean: 92.49 us
stddev: 32.63 us
median: 83.45 us
min-max: [11.13 .. 2731.55] us
percentile(95): 117.76 us
[1]: /messages/by-id/CA+HiwqG3Jz-cRS=4gqXmZDjDAi==19GvrFCCqAawwHcOCEn4fQ@mail.gmail.com
--
Euler Taveira
EnterpriseDB: https://www.enterprisedb.com/
Attachments:
0001-Remove-unused-column-from-initial-table-synchronizat.patch (text/x-patch)
From 78aa13f958d883f52ef0a9796536adf06cd58273 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 11:13:23 -0300
Subject: [PATCH 1/7] Remove unused column from initial table synchronization
Column atttypmod was added in commit 7c4f52409a, but it is not used.
The removal is safe because COPY from the publisher does not need such
information.
---
src/backend/replication/logical/tablesync.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 863d196fd7..a18f847ade 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -640,7 +640,7 @@ fetch_remote_table_info(char *nspname, char *relname,
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
- Oid attrRow[] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
+ Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
bool isnull;
int natt;
@@ -685,7 +685,6 @@ fetch_remote_table_info(char *nspname, char *relname,
appendStringInfo(&cmd,
"SELECT a.attname,"
" a.atttypid,"
- " a.atttypmod,"
" a.attnum = ANY(i.indkey)"
" FROM pg_catalog.pg_attribute a"
" LEFT JOIN pg_catalog.pg_index i"
@@ -718,7 +717,7 @@ fetch_remote_table_info(char *nspname, char *relname,
Assert(!isnull);
lrel->atttyps[natt] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
Assert(!isnull);
- if (DatumGetBool(slot_getattr(slot, 4, &isnull)))
+ if (DatumGetBool(slot_getattr(slot, 3, &isnull)))
lrel->attkeys = bms_add_member(lrel->attkeys, natt);
/* Should never happen. */
--
2.20.1
0002-Rename-a-WHERE-node.patch (text/x-patch)
From 13917594bd781cd614875498eacec48ce1d64537 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 11:53:34 -0300
Subject: [PATCH 2/7] Rename a WHERE node
A WHERE clause will be used for row filtering in logical replication. We
already have a similar node: 'WHERE (condition here)'. Let's rename the
node to a generic name and use it for row filtering too.
---
src/backend/parser/gram.y | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b2f447bf9a..793aac5377 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -485,7 +485,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <node> def_arg columnElem where_clause where_or_current_clause
a_expr b_expr c_expr AexprConst indirection_el opt_slice_bound
columnref in_expr having_clause func_table xmltable array_expr
- ExclusionWhereClause operator_def_arg
+ OptWhereClause operator_def_arg
%type <list> rowsfrom_item rowsfrom_list opt_col_def_list
%type <boolean> opt_ordinality
%type <list> ExclusionConstraintList ExclusionConstraintElem
@@ -3805,7 +3805,7 @@ ConstraintElem:
$$ = (Node *)n;
}
| EXCLUDE access_method_clause '(' ExclusionConstraintList ')'
- opt_c_include opt_definition OptConsTableSpace ExclusionWhereClause
+ opt_c_include opt_definition OptConsTableSpace OptWhereClause
ConstraintAttributeSpec
{
Constraint *n = makeNode(Constraint);
@@ -3907,7 +3907,7 @@ ExclusionConstraintElem: index_elem WITH any_operator
}
;
-ExclusionWhereClause:
+OptWhereClause:
WHERE '(' a_expr ')' { $$ = $3; }
| /*EMPTY*/ { $$ = NULL; }
;
--
2.20.1
0003-Row-filter-for-logical-replication.patch (text/x-patch)
From 4bb2622ba46407989133d8e2766bcac513635a51 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 12:07:51 -0300
Subject: [PATCH 3/7] Row filter for logical replication
This feature adds a row filter for publication tables. When you define or
modify a publication you can optionally filter out rows that do not
satisfy a WHERE condition. It allows you to partially replicate a
database or a set of tables. The row filter is per table, which means that
you can define different row filters for different tables. A new row
filter can be added simply by specifying the WHERE clause after the table
name. The WHERE expression must be enclosed in parentheses.
The WHERE clause should contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, UPDATEs
and DELETEs won't be replicated. For simplicity, functions are not
allowed; this could possibly be addressed in another patch.
If you choose to do the initial table synchronization, only data that
satisfies the row filters is sent. If the subscription has several
publications in which a table has been published with different WHERE
clauses, rows must satisfy all expressions to be copied.
If your publication contains a partitioned table, the parameter
publish_via_partition_root determines whether it uses the partition row
filter (if the parameter is false -- default) or the partitioned table
row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 37 +++-
doc/src/sgml/ref/create_subscription.sgml | 8 +-
src/backend/catalog/pg_publication.c | 110 +++++++++++-
src/backend/commands/publicationcmds.c | 96 ++++++----
src/backend/parser/gram.y | 28 ++-
src/backend/parser/parse_agg.c | 10 ++
src/backend/parser/parse_expr.c | 13 ++
src/backend/parser/parse_func.c | 3 +
src/backend/replication/logical/tablesync.c | 93 +++++++++-
src/backend/replication/logical/worker.c | 14 +-
src/backend/replication/pgoutput/pgoutput.c | 179 ++++++++++++++++--
src/include/catalog/pg_publication.h | 10 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/include/replication/logicalrelation.h | 2 +
src/test/regress/expected/publication.out | 32 ++++
src/test/regress/sql/publication.sql | 23 +++
src/test/subscription/t/020_row_filter.pl | 190 ++++++++++++++++++++
22 files changed, 801 insertions(+), 85 deletions(-)
create mode 100644 src/test/subscription/t/020_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 865e826fb0..1ea4076219 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -5493,6 +5493,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
are simple references.
</para></entry>
</row>
+
+ <row>
+ <entry><structfield>prqual</structfield></entry>
+ <entry><type>pg_node_tree</type></entry>
+ <entry></entry>
+ <entry>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b2c6..ca091aae33 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is executed with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbca55..5253037155 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression.
</para>
<para>
@@ -131,9 +135,15 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
on its partitions) contained in the publication will be published
using the identity and schema of the partitioned table rather than
that of the individual partitions that are actually changed; the
- latter is the default. Enabling this allows the changes to be
- replicated into a non-partitioned table or a partitioned table
- consisting of a different set of partitions.
+ latter is the default (<literal>false</literal>). Enabling this
+ allows the changes to be replicated into a non-partitioned table or a
+ partitioned table consisting of a different set of partitions.
+ </para>
+
+ <para>
+ If this parameter is <literal>false</literal>, it uses the
+ <literal>WHERE</literal> clause from the partition; otherwise, the
+ <literal>WHERE</literal> clause from the partitioned table is used.
</para>
<para>
@@ -182,6 +192,13 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
disallowed on those tables.
</para>
+ <para>
+ Columns used in the <literal>WHERE</literal> clause must be part of the
+ primary key or be covered by <literal>REPLICA IDENTITY</literal> otherwise
+ <command>UPDATE</command> and <command>DELETE</command> operations will not
+ be replicated.
+ </para>
+
<para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
@@ -197,6 +214,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -209,6 +231,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</programlisting>
</para>
+ <para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
<para>
Create a publication that publishes all changes in all tables:
<programlisting>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index e812beee37..b8f4ea5603 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,13 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 84d2efcfd2..fd9549a630 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,11 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
+
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,18 +146,20 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -161,7 +168,7 @@ publication_add_relation(Oid pubid, Relation targetrel,
* duplicates, it's here just to provide nicer error message in common
* case. The real protection is the unique key on the catalog.
*/
- if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid),
+ if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(targetrel->relid),
ObjectIdGetDatum(pubid)))
{
table_close(rel, RowExclusiveLock);
@@ -172,10 +179,27 @@ publication_add_relation(Oid pubid, Relation targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel), pub->name)));
+ RelationGetRelationName(targetrel->relation), pub->name)));
}
- check_publication_add_relation(targetrel);
+ check_publication_add_relation(targetrel->relation);
+
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(targetrel->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(targetrel->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -187,7 +211,13 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prpubid - 1] =
ObjectIdGetDatum(pubid);
values[Anum_pg_publication_rel_prrelid - 1] =
- ObjectIdGetDatum(relid);
+ ObjectIdGetDatum(targetrel->relid);
+
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
@@ -202,14 +232,20 @@ publication_add_relation(Oid pubid, Relation targetrel,
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
/* Add dependency on the relation */
- ObjectAddressSet(referenced, RelationRelationId, relid);
+ ObjectAddressSet(referenced, RelationRelationId, targetrel->relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel);
+ CacheInvalidateRelcache(targetrel->relation);
return myself;
}
@@ -304,6 +340,64 @@ GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt)
return result;
}
+/*
+ * Gets list of PublicationRelationQuals for a publication.
+ */
+List *
+GetPublicationRelationQuals(Oid pubid)
+{
+ List *result;
+ Relation pubrelsrel;
+ ScanKeyData scankey;
+ SysScanDesc scan;
+ HeapTuple tup;
+
+ /* Find all relations associated with the publication. */
+ pubrelsrel = table_open(PublicationRelRelationId, AccessShareLock);
+
+ ScanKeyInit(&scankey,
+ Anum_pg_publication_rel_prpubid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(pubid));
+
+ scan = systable_beginscan(pubrelsrel, PublicationRelPrrelidPrpubidIndexId,
+ true, NULL, 1, &scankey);
+
+ result = NIL;
+ while (HeapTupleIsValid(tup = systable_getnext(scan)))
+ {
+ Form_pg_publication_rel pubrel;
+ PublicationRelationQual *prq;
+ Datum value_datum;
+ char *qual_value;
+ Node *qual_expr;
+ bool isnull;
+
+ pubrel = (Form_pg_publication_rel) GETSTRUCT(tup);
+
+ value_datum = heap_getattr(tup, Anum_pg_publication_rel_prqual, RelationGetDescr(pubrelsrel), &isnull);
+ if (!isnull)
+ {
+ qual_value = TextDatumGetCString(value_datum);
+ qual_expr = (Node *) stringToNode(qual_value);
+ }
+ else
+ qual_expr = NULL;
+
+ prq = palloc(sizeof(PublicationRelationQual));
+ prq->relid = pubrel->prrelid;
+ /* table will be opened in AlterPublicationTables */
+ prq->relation = NULL;
+ prq->whereClause = qual_expr;
+ result = lappend(result, prq);
+ }
+
+ systable_endscan(scan);
+ table_close(pubrelsrel, AccessShareLock);
+
+ return result;
+}
+
/*
* Gets list of publication oids for publications marked as FOR ALL TABLES.
*/
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 95c253c8e0..a5eccbbfb5 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -372,6 +372,28 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ /*
+ * Although the ALTER PUBLICATION grammar allows a WHERE clause to be
+ * specified for the DROP TABLE action, it doesn't make sense to allow it.
+ * We implement this restriction here, instead of complicating the grammar
+ * to enforce it.
+ */
+ if (stmt->tableAction == DEFELEM_DROP)
+ {
+ ListCell *lc;
+
+ foreach(lc, stmt->tables)
+ {
+ PublicationTable *t = lfirst(lc);
+
+ if (t->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot use a WHERE clause when removing a table from publication \"%s\"",
+ NameStr(pubform->pubname))));
+ }
+ }
+
rels = OpenTableList(stmt->tables);
if (stmt->tableAction == DEFELEM_ADD)
@@ -389,27 +411,22 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
+ PublicationRelationQual *oldrel;
- foreach(newlc, rels)
- {
- Relation newrel = (Relation) lfirst(newlc);
+ /*
+ * Remove all publication-table mappings. We could possibly
+ * remove (i) tables that are not found in the new table list and
+ * (ii) tables that are being re-added with a different qual
+ * expression. For (ii), simply updating the existing tuple is not
+ * enough, because of qual expression dependencies.
+ */
+ oldrel = palloc(sizeof(PublicationRelationQual));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
- if (RelationGetRelid(newrel) == oldrelid)
- {
- found = true;
- break;
- }
- }
-
- if (!found)
- {
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
-
- delrels = lappend(delrels, oldrel);
- }
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -509,13 +526,15 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationQual *prq;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
+ PublicationTable *t = lfirst(lc);
+ RangeVar *rv = castNode(RangeVar, t->relation);
bool recurse = rv->inh;
Relation rel;
Oid myrelid;
@@ -538,8 +557,11 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ prq = palloc(sizeof(PublicationRelationQual));
+ prq->relid = myrelid;
+ prq->relation = rel;
+ prq->whereClause = t->whereClause;
+ rels = lappend(rels, prq);
relids = lappend_oid(relids, myrelid);
/*
@@ -572,7 +594,12 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ prq = palloc(sizeof(PublicationRelationQual));
+ prq->relid = childrelid;
+ prq->relation = rel;
+ /* child inherits WHERE clause from parent */
+ prq->whereClause = t->whereClause;
+ rels = lappend(rels, prq);
relids = lappend_oid(relids, childrelid);
}
}
@@ -593,10 +620,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *prq = (PublicationRelationQual *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(prq->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -612,15 +641,15 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *prq = (PublicationRelationQual *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(prq->relid, GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(prq->relation->rd_rel->relkind),
+ RelationGetRelationName(prq->relation));
- obj = publication_add_relation(pubid, rel, if_not_exists);
+ obj = publication_add_relation(pubid, prq, if_not_exists);
if (stmt)
{
EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
@@ -644,11 +673,10 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationQual *prq = (PublicationRelationQual *) lfirst(lc);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(prq->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -658,7 +686,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(prq->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
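
For illustration, the intended user-visible behavior of the commands above (publication and table names are hypothetical, not taken from the regression tests):

```sql
-- Attach a row filter when adding a table; rows that don't satisfy
-- the predicate are not replicated to subscribers.
CREATE PUBLICATION pub_emp FOR TABLE employees WHERE (salary > 50000);

-- The grammar accepts a WHERE clause for DROP TABLE too, but
-- AlterPublicationTables rejects it with an error, since filtering
-- makes no sense when removing a table from a publication.
ALTER PUBLICATION pub_emp DROP TABLE employees WHERE (salary > 50000);
```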
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 793aac5377..39088f1c83 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -416,12 +416,12 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
vacuum_relation_list opt_vacuum_relation_list
- drop_option_list
+ drop_option_list publication_table_list
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -3805,7 +3805,7 @@ ConstraintElem:
$$ = (Node *)n;
}
| EXCLUDE access_method_clause '(' ExclusionConstraintList ')'
- opt_c_include opt_definition OptConsTableSpace OptWhereClause
+ opt_c_include opt_definition OptConsTableSpace OptWhereClause
ConstraintAttributeSpec
{
Constraint *n = makeNode(Constraint);
@@ -9498,7 +9498,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9529,7 +9529,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9537,7 +9537,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9545,7 +9545,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE relation_expr_list
+ | ALTER PUBLICATION name DROP TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9555,6 +9555,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 588f005dd9..cf2b3868bd 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -544,6 +544,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -933,6 +940,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_GENERATED_COLUMN:
err = _("window functions are not allowed in column generation expressions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 379355f9bf..0f2045363b 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -507,6 +514,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_CALL_ARGUMENT:
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1763,6 +1771,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3044,6 +3055,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "WHERE";
case EXPR_KIND_GENERATED_COLUMN:
return "GENERATED AS";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 07d0013e84..6ba01452a3 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2527,6 +2527,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_GENERATED_COLUMN:
err = _("set-returning functions are not allowed in column generation expressions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index a18f847ade..63d00a75ed 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -630,19 +630,25 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
+
+ memset(lrel, 0, sizeof(LogicalRepRelation));
lrel->nspname = nspname;
lrel->relname = relname;
@@ -731,6 +737,51 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
+ walrcv_clear_result(res);
+
+ /* Get relation qual */
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(wrconn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
walrcv_clear_result(res);
pfree(cmd.data);
}
@@ -745,6 +796,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -753,7 +805,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -762,16 +814,23 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* List of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && list_length(qual) == 0)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -780,9 +839,31 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (list_length(qual) > 0)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
+
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
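
As a sketch of what the tablesync change above produces: for a subscription to the publication from the cover letter, `CREATE PUBLICATION foo FOR TABLE departments WHERE (id > 2000 AND id <= 3000)`, the initial synchronization issues a filtered COPY rather than a plain table COPY (the column list here is illustrative):

```sql
COPY (SELECT id, name FROM public.departments
      WHERE (id > 2000 AND id <= 3000)) TO STDOUT
```

If the table appears in several subscribed publications with different quals, the filters are AND'ed together in the WHERE clause, matching the apply-time semantics in pgoutput.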
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index eb7db89cef..6092ae0df0 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -340,8 +340,8 @@ handle_streamed_transaction(LogicalRepMsgType action, StringInfo s)
*
* This is based on similar code in copy.c
*/
-static EState *
-create_estate_for_relation(LogicalRepRelMapEntry *rel)
+EState *
+create_estate_for_relation(Relation rel)
{
EState *estate;
RangeTblEntry *rte;
@@ -350,8 +350,8 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel)
rte = makeNode(RangeTblEntry);
rte->rtekind = RTE_RELATION;
- rte->relid = RelationGetRelid(rel->localrel);
- rte->relkind = rel->localrel->rd_rel->relkind;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
rte->rellockmode = AccessShareLock;
ExecInitRangeTable(estate, list_make1(rte));
@@ -1176,7 +1176,7 @@ apply_handle_insert(StringInfo s)
}
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -1301,7 +1301,7 @@ apply_handle_update(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -1458,7 +1458,7 @@ apply_handle_delete(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 79765f9696..dd6f3bda3a 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -15,12 +15,22 @@
#include "access/tupconvert.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
+#include "catalog/pg_type.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/execnodes.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/planner.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
@@ -57,6 +67,8 @@ static void pgoutput_stream_abort(struct LogicalDecodingContext *ctx,
static void pgoutput_stream_commit(struct LogicalDecodingContext *ctx,
ReorderBufferTXN *txn,
XLogRecPtr commit_lsn);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, List *rowfilter);
static bool publications_valid;
static bool in_streaming;
@@ -98,6 +110,7 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *qual;
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -121,7 +134,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation rel);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -489,6 +502,99 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
OutputPluginWrite(ctx, false);
}
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false, i.e. the
+ * change isn't replicated.
+ */
+static inline bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change should be replicated, false otherwise.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, List *rowfilter)
+{
+ TupleDesc tupdesc;
+ EState *estate;
+ ExprContext *ecxt;
+ MemoryContext oldcxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (list_length(rowfilter) == 0)
+ return true;
+
+ tupdesc = RelationGetDescr(relation);
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldcxt);
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, rowfilter)
+ {
+ Node *rfnode = (Node *) lfirst(lc);
+ Oid exprtype;
+ Expr *expr;
+ ExprState *exprstate;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ exprstate = ExecPrepareExpr(expr, estate);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+
+ return result;
+}
+
/*
* Sends the decoded DML over wire.
*
@@ -516,7 +622,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -560,6 +666,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
tuple = execute_attr_map_tuple(tuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry->qual))
+ return;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_insert(ctx->out, xid, relation, tuple,
data->binary);
@@ -586,6 +696,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry->qual))
+ return;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_update(ctx->out, xid, relation, oldtuple,
newtuple, data->binary);
@@ -608,6 +722,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry->qual))
+ return;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
data->binary);
@@ -655,12 +773,11 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
for (i = 0; i < nrelations; i++)
{
Relation relation = relations[i];
- Oid relid = RelationGetRelid(relation);
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -670,10 +787,10 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
* root tables through it.
*/
if (relation->rd_rel->relispartition &&
- relentry->publish_as_relid != relid)
+ relentry->publish_as_relid != relentry->relid)
continue;
- relids[nrelids++] = relid;
+ relids[nrelids++] = relentry->relid;
maybe_send_schema(ctx, txn, change, relation, relentry);
}
@@ -941,16 +1058,21 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation rel)
{
RelationSyncEntry *entry;
- bool am_partition = get_rel_relispartition(relid);
- char relkind = get_rel_relkind(relid);
+ Oid relid;
+ bool am_partition;
+ char relkind;
bool found;
MemoryContext oldctx;
Assert(RelationSyncCache != NULL);
+ relid = RelationGetRelid(rel);
+ am_partition = get_rel_relispartition(relid);
+ relkind = get_rel_relkind(relid);
+
/* Find cached relation info, creating if not found */
entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
(void *) &relid,
@@ -966,6 +1088,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->qual = NIL;
entry->publish_as_relid = InvalidOid;
}
@@ -997,6 +1120,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1056,9 +1182,29 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ entry->qual = lappend(entry->qual, rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1164,6 +1310,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1173,6 +1320,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1190,5 +1339,11 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (list_length(entry->qual) > 0)
+ list_free_deep(entry->qual);
+ entry->qual = NIL;
}
+
+ MemoryContextSwitchTo(oldctx);
}
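
A hypothetical illustration of the NULL handling in pgoutput_row_filter_exec_expr above (table and column names are made up for the example):

```sql
-- With a publication defined as:
--   CREATE PUBLICATION p FOR TABLE t WHERE (value > 10);
-- a row with value = NULL makes the predicate evaluate to NULL,
-- which the output plugin treats as false.
INSERT INTO t VALUES (1, NULL);  -- not replicated to subscribers
INSERT INTO t VALUES (2, 42);    -- replicated to subscribers
```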
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 4127611f5a..0263baf72a 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,6 +85,13 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationQual
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationQual;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -106,11 +113,12 @@ typedef enum PublicationPartOpt
} PublicationPartOpt;
extern List *GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt);
+extern List *GetPublicationRelationQuals(Oid pubid);
extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index c79b7fb487..d26673a111 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid; /* Oid of the publication */
Oid prrelid; /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, on pg_publication_rel using btree(oid oid_ops));
#define PublicationRelObjectIndexId 6112
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index caed683ba9..f912c3d6f1 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -480,6 +480,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 068c6ec440..cf9973f8a3 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3517,12 +3517,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3535,7 +3542,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index dfc214b06f..ac8ae4fa9c 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -78,6 +78,7 @@ typedef enum ParseExprKind
EXPR_KIND_CALL_ARGUMENT, /* procedure argument in CALL */
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 3f0b3deefb..a59ad2c9c8 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -49,4 +49,6 @@ extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
extern void logicalrep_typmap_update(LogicalRepTyp *remotetyp);
extern char *logicalrep_typmap_gettypname(Oid remoteid);
+extern EState *create_estate_for_relation(Relation rel);
+
#endif /* LOGICALRELATION_H */
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 63d6ab7a4e..c8cf1b685e 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -156,6 +156,38 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: cannot use a WHERE clause when removing table from publication "testpub5"
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: "testpub_view" is not a table
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075368..35211c56f6 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,29 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+\dRp+ testpub5
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/020_row_filter.pl b/src/test/subscription/t/020_row_filter.pl
new file mode 100644
index 0000000000..b8c059d44b
--- /dev/null
+++ b/src/test/subscription/t/020_row_filter.pl
@@ -0,0 +1,190 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 6;
+
+# create publisher node
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+
+# test row filtering
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x"
+);
+# use partition row filter:
+# - replicate (1, 100) because 1 < 6000 is true
+# - don't replicate (8000, 101) because 8000 < 6000 is false
+# - replicate (15000, 102) because partition tab_rowfilter_greater_10k doesn't have row filter
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(8000, 101),(15000, 102)"
+);
+# insert directly into partition
+# use partition row filter: replicate (2, 200) because 2 < 6000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200)");
+# use partition row filter: replicate (5500, 300) because 5500 < 6000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(5500, 300)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table sync to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+5500|300), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102), 'check filtered data was copied to subscriber');
+
+# publish using partitioned table
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+# use partitioned table row filter: replicate, 4000 < 5000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400)");
+# use partitioned table row filter: replicate, 4500 < 5000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+# use partitioned table row filter: don't replicate, 5600 < 5000 is false
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+# use partitioned table row filter: don't replicate, 16000 < 5000 is false
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 1950)");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
2.20.1
Attachment: 0004-Print-publication-WHERE-condition-in-psql.patch (text/x-patch)
From 6983d9831cd4cc8f9d40eaaa28ef74bec05a6147 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 12:09:19 -0300
Subject: [PATCH 4/7] Print publication WHERE condition in psql
---
src/bin/psql/describe.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 20af5a92b4..8a7113ce15 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6011,7 +6011,8 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
+ "SELECT n.nspname, c.relname,\n"
+ " pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
@@ -6041,6 +6042,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE %s",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
--
2.20.1
Attachment: 0005-Publication-WHERE-condition-support-for-pg_dump.patch (text/x-patch)
From 8a8d125c3b3f6a14a6560598ac99e2a52f8a0028 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 14:29:59 -0300
Subject: [PATCH 5/7] Publication WHERE condition support for pg_dump
---
src/bin/pg_dump/pg_dump.c | 14 ++++++++++++--
src/bin/pg_dump/pg_dump.h | 1 +
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 39da742e32..50388ae8ca 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4080,6 +4080,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4091,7 +4092,8 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
/* Collect all publication membership info. */
appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
"FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
@@ -4101,6 +4103,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4141,6 +4144,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4173,8 +4180,11 @@ dumpPublicationTable(Archive *fout, PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE %s", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 1290f9659b..7927039d1d 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -625,6 +625,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
--
2.20.1
Attachment: 0006-Debug-messages.patch (text/x-patch)
From 24aea97f9e56755eaefbe6b77d45e0dee8ffb550 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Fri, 29 Jan 2021 23:28:13 -0300
Subject: [PATCH 6/7] Debug messages
---
src/backend/replication/pgoutput/pgoutput.c | 14 ++++++++++++++
src/test/subscription/t/020_row_filter.pl | 1 +
2 files changed, 15 insertions(+)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index dd6f3bda3a..17a728c0bf 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -518,6 +518,10 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+ elog(DEBUG2, "pgoutput_row_filter_exec_expr: ret: %d ; isnull: %d",
+ DatumGetBool(ret) ? 1 : 0,
+ isnull ? 1 : 0);
+
if (isnull)
return false;
@@ -543,6 +547,8 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
if (list_length(rowfilter) == 0)
return true;
+ elog(DEBUG1, "table %s has row filter", get_rel_name(relation->rd_id));
+
tupdesc = RelationGetDescr(relation);
estate = create_estate_for_relation(relation);
@@ -566,6 +572,7 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
Oid exprtype;
Expr *expr;
ExprState *exprstate;
+ char *s = NULL;
/* Prepare expression for execution */
exprtype = exprType(rfnode);
@@ -585,6 +592,13 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
/* If the tuple does not match one of the row filters, bail out */
+ s = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, CStringGetTextDatum(nodeToString(rfnode)), ObjectIdGetDatum(relation->rd_id)));
+ if (result)
+ elog(DEBUG2, "pgoutput_row_filter: row filter \"%s\" matched", s);
+ else
+ elog(DEBUG2, "pgoutput_row_filter: row filter \"%s\" not matched", s);
+ pfree(s);
+
if (!result)
break;
}
diff --git a/src/test/subscription/t/020_row_filter.pl b/src/test/subscription/t/020_row_filter.pl
index b8c059d44b..ea1d7c30ae 100644
--- a/src/test/subscription/t/020_row_filter.pl
+++ b/src/test/subscription/t/020_row_filter.pl
@@ -8,6 +8,7 @@ use Test::More tests => 6;
# create publisher node
my $node_publisher = get_new_node('publisher');
$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf', 'log_min_messages = DEBUG2');
$node_publisher->start;
# create subscriber node
--
2.20.1
Attachment: 0007-Measure-row-filter-overhead.patch (text/x-patch)
From 6c684ac80914a388a942b435d0e78ab327564b6c Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Sun, 31 Jan 2021 20:48:43 -0300
Subject: [PATCH 7/7] Measure row filter overhead
---
src/backend/replication/pgoutput/pgoutput.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 17a728c0bf..ebd45c1c84 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -542,6 +542,8 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
MemoryContext oldcxt;
ListCell *lc;
bool result = true;
+ instr_time start_time;
+ instr_time end_time;
/* Bail out if there is no row filter */
if (list_length(rowfilter) == 0)
@@ -549,6 +551,8 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
elog(DEBUG1, "table %s has row filter", get_rel_name(relation->rd_id));
+ INSTR_TIME_SET_CURRENT(start_time);
+
tupdesc = RelationGetDescr(relation);
estate = create_estate_for_relation(relation);
@@ -606,6 +610,11 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
ResetExprContext(ecxt);
FreeExecutorState(estate);
+ INSTR_TIME_SET_CURRENT(end_time);
+ INSTR_TIME_SUBTRACT(end_time, start_time);
+
+ elog(DEBUG2, "row filter time: %0.3f us", INSTR_TIME_GET_DOUBLE(end_time) * 1e6);
+
return result;
}
--
2.20.1
On Mon, 01 Feb 2021 at 08:23, Euler Taveira <euler@eulerto.com> wrote:
On Mon, Mar 16, 2020, at 10:58 AM, David Steele wrote:
Please submit to a future CF when a new patch is available.
Hi,
This is another version of the row filter patch. Patch summary:
0001: refactor to remove dead code
0002: grammar refactor for row filter
0003: core code, documentation, and tests
0004: psql code
0005: pg_dump support
0006: debug messages (only for test purposes)
0007: measure row filter overhead (only for test purposes)
Thanks for updating the patch. Here are some comments:
(1)
+ <para>
+ If this parameter is <literal>false</literal>, it uses the
+ <literal>WHERE</literal> clause from the partition; otherwise,the
+ <literal>WHERE</literal> clause from the partitioned table is used.
</para>
otherwise,the -> otherwise, the
(2)
+ <para>
+ Columns used in the <literal>WHERE</literal> clause must be part of the
+ primary key or be covered by <literal>REPLICA IDENTITY</literal> otherwise
+ <command>UPDATE</command> and <command>DELETE</command> operations will not
+ be replicated.
+ </para>
+
IMO we should indent one space here.
(3)
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
Same as (2).
The documentation says:
Columns used in the <literal>WHERE</literal> clause must be part of the
primary key or be covered by <literal>REPLICA IDENTITY</literal> otherwise
<command>UPDATE</command> and <command>DELETE</command> operations will not
be replicated.
Why do we need this limitation? Am I missing something?
When I tested, I found that the UPDATE is replicated, while the DELETE
is not. Here is my test case:
-- 1. Create tables and publications on publisher
CREATE TABLE t1 (a int primary key, b int);
CREATE TABLE t2 (a int primary key, b int);
INSERT INTO t1 VALUES (1, 11);
INSERT INTO t2 VALUES (1, 11);
CREATE PUBLICATION mypub1 FOR TABLE t1;
CREATE PUBLICATION mypub2 FOR TABLE t2 WHERE (b > 10);
-- 2. Create tables and subscriptions on subscriber
CREATE TABLE t1 (a int primary key, b int);
CREATE TABLE t2 (a int primary key, b int);
CREATE SUBSCRIPTION mysub1 CONNECTION 'host=localhost port=8765 dbname=postgres' PUBLICATION mypub1;
CREATE SUBSCRIPTION mysub2 CONNECTION 'host=localhost port=8765 dbname=postgres' PUBLICATION mypub2;
-- 3. Check publications on publisher
postgres=# \dRp+
Publication mypub1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
-------+------------+---------+---------+---------+-----------+----------
japin | f | t | t | t | t | f
Tables:
"public.t1"
Publication mypub2
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
-------+------------+---------+---------+---------+-----------+----------
japin | f | t | t | t | t | f
Tables:
"public.t2" WHERE (b > 10)
-- 4. Check initialization data on subscriber
postgres=# table t1;
a | b
---+----
1 | 11
(1 row)
postgres=# table t2;
a | b
---+----
1 | 11
(1 row)
-- 5. The update on publisher
postgres=# update t1 set b = 111 where b = 11;
UPDATE 1
postgres=# table t1;
a | b
---+-----
1 | 111
(1 row)
postgres=# update t2 set b = 111 where b = 11;
UPDATE 1
postgres=# table t2;
a | b
---+-----
1 | 111
(1 row)
-- 6. check the updated records on subscriber
postgres=# table t1;
a | b
---+-----
1 | 111
(1 row)
postgres=# table t2;
a | b
---+-----
1 | 111
(1 row)
-- 7. Delete records on publisher
postgres=# delete from t1 where b = 111;
DELETE 1
postgres=# table t1;
a | b
---+---
(0 rows)
postgres=# delete from t2 where b = 111;
DELETE 1
postgres=# table t2;
a | b
---+---
(0 rows)
-- 8. Check the deleted records on subscriber
postgres=# table t1;
a | b
---+---
(0 rows)
postgres=# table t2;
a | b
---+-----
1 | 111
(1 row)
I did some simple debugging and found that pgoutput_row_filter() returns false
when I execute "delete from t2 where b = 111;".
Does the publication only load the REPLICA IDENTITY columns into oldtuple when we
execute a DELETE? If so, pgoutput_row_filter() cannot see the non-REPLICA-IDENTITY
columns, which causes it to return false, right? If that's right, UPDATE should
not be limited by REPLICA IDENTITY, because all columns are present in newtuple,
shouldn't it?
--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
On Mon, Feb 1, 2021, at 6:11 AM, japin wrote:
Thanks for updating the patch. Here are some comments:
Thanks for your review. I updated the documentation accordingly.
The documentation says:
Columns used in the <literal>WHERE</literal> clause must be part of the
primary key or be covered by <literal>REPLICA IDENTITY</literal> otherwise
<command>UPDATE</command> and <command>DELETE</command> operations will not
be replicated.
The UPDATE is an oversight from a previous version.
Does the publication only load the REPLICA IDENTITY columns into oldtuple when we
execute a DELETE? If so, pgoutput_row_filter() cannot see the non-REPLICA-IDENTITY
columns, which causes it to return false, right? If that's right, UPDATE should
not be limited by REPLICA IDENTITY, because all columns are present in newtuple,
shouldn't it?
No. oldtuple may be available for UPDATE and DELETE, but the row filter
considers only one tuple for filtering. INSERT has only newtuple; the row
filter uses it. UPDATE has newtuple and optionally oldtuple (if the table has a
PK or REPLICA IDENTITY); the row filter uses newtuple. DELETE optionally has
oldtuple; the row filter uses it (if available). Keep in mind that if the
expression evaluates to NULL, it returns false and the row won't be replicated.
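The tuple-selection rules described above can be sketched as a small model. This is an illustrative Python sketch of the documented behavior only, not the actual pgoutput C code; all function and variable names here are made up for the example.

```python
# Minimal model of which tuple the publication row filter is evaluated
# against, and of the NULL-means-not-replicated rule described above.

def tuple_for_row_filter(change, newtuple, oldtuple):
    """Pick the tuple the row filter is evaluated against."""
    if change == "INSERT":
        return newtuple                # only newtuple exists
    if change == "UPDATE":
        return newtuple                # oldtuple, even if present, is not used
    if change == "DELETE":
        return oldtuple                # may lack non-identity columns
    raise ValueError(change)

def row_filter_matches(predicate, tup):
    """NULL (None) from the expression means the row is NOT replicated."""
    if tup is None:
        return False
    result = predicate(tup)
    return bool(result) if result is not None else False

# A DELETE ships only the replica-identity columns, so a filter on a
# non-identity column evaluates to NULL and the row is filtered out:
old = {"a": 1, "b": None}    # b is not part of the REPLICA IDENTITY
pred = lambda t: None if t["b"] is None else t["b"] > 10
print(row_filter_matches(pred, tuple_for_row_filter("DELETE", None, old)))  # False
```

This models why the DELETE in the test case above was not replicated even though the row satisfied the filter on the publisher: the filter column b was not available in oldtuple.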
After commit 3696a600e2, the previous patch set no longer applies cleanly. I'm
attaching another version that also addresses the documentation issues.
--
Euler Taveira
EnterpriseDB: https://www.enterprisedb.com/
Attachments:
Attachment: v10-0001-Remove-unused-column-from-initial-table-synchronizat.patch (text/x-patch)
From fb45140efacea0cfc9478f2c3747d9a4e5cecbd0 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 11:13:23 -0300
Subject: [PATCH 1/7] Remove unused column from initial table synchronization
Column atttypmod was added in commit 7c4f52409a, but it is not used.
The removal is safe because COPY from the publisher does not need such
information.
---
src/backend/replication/logical/tablesync.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 863d196fd7..a18f847ade 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -640,7 +640,7 @@ fetch_remote_table_info(char *nspname, char *relname,
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
- Oid attrRow[] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
+ Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
bool isnull;
int natt;
@@ -685,7 +685,6 @@ fetch_remote_table_info(char *nspname, char *relname,
appendStringInfo(&cmd,
"SELECT a.attname,"
" a.atttypid,"
- " a.atttypmod,"
" a.attnum = ANY(i.indkey)"
" FROM pg_catalog.pg_attribute a"
" LEFT JOIN pg_catalog.pg_index i"
@@ -718,7 +717,7 @@ fetch_remote_table_info(char *nspname, char *relname,
Assert(!isnull);
lrel->atttyps[natt] = DatumGetObjectId(slot_getattr(slot, 2, &isnull));
Assert(!isnull);
- if (DatumGetBool(slot_getattr(slot, 4, &isnull)))
+ if (DatumGetBool(slot_getattr(slot, 3, &isnull)))
lrel->attkeys = bms_add_member(lrel->attkeys, natt);
/* Should never happen. */
--
2.20.1
Attachment: v10-0002-Rename-a-WHERE-node.patch (text/x-patch)
From 4902323fca0efa8024de600853670b2a1f0e3ca0 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 11:53:34 -0300
Subject: [PATCH 2/7] Rename a WHERE node
A WHERE clause will be used for row filtering in logical replication. We
already have a similar node: 'WHERE (condition here)'. Let's rename the
node to a generic name and use it for row filtering too.
---
src/backend/parser/gram.y | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index dd72a9fc3c..ecfd98ba5b 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -485,7 +485,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <node> def_arg columnElem where_clause where_or_current_clause
a_expr b_expr c_expr AexprConst indirection_el opt_slice_bound
columnref in_expr having_clause func_table xmltable array_expr
- ExclusionWhereClause operator_def_arg
+ OptWhereClause operator_def_arg
%type <list> rowsfrom_item rowsfrom_list opt_col_def_list
%type <boolean> opt_ordinality
%type <list> ExclusionConstraintList ExclusionConstraintElem
@@ -3806,7 +3806,7 @@ ConstraintElem:
$$ = (Node *)n;
}
| EXCLUDE access_method_clause '(' ExclusionConstraintList ')'
- opt_c_include opt_definition OptConsTableSpace ExclusionWhereClause
+ opt_c_include opt_definition OptConsTableSpace OptWhereClause
ConstraintAttributeSpec
{
Constraint *n = makeNode(Constraint);
@@ -3908,7 +3908,7 @@ ExclusionConstraintElem: index_elem WITH any_operator
}
;
-ExclusionWhereClause:
+OptWhereClause:
WHERE '(' a_expr ')' { $$ = $3; }
| /*EMPTY*/ { $$ = NULL; }
;
--
2.20.1
Attachment: v10-0003-Row-filter-for-logical-replication.patch (text/x-patch)
From a3693dc370e80166c5fb8ba08c765a1559aaf89c Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 12:07:51 -0300
Subject: [PATCH 3/7] Row filter for logical replication
This feature adds a row filter for publication tables. When you define or
modify a publication, you can optionally filter out rows that do not
satisfy a WHERE condition. It allows you to partially replicate a
database or a set of tables. The row filter is per table, which means
that you can define different row filters for different tables. A new
row filter can be added simply by specifying a WHERE clause after the
table name. The WHERE expression must be enclosed in parentheses.
The WHERE clause should contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, UPDATEs
and DELETEs won't be replicated. For simplicity, functions are not
allowed; this restriction could be lifted in a future patch.
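For example, a filter on a non-key column can be made safe for UPDATE and
DELETE replication by widening the replica identity (a sketch with a
hypothetical table; note that REPLICA IDENTITY FULL logs the whole old
row, which has a WAL cost):

ALTER TABLE employees REPLICA IDENTITY FULL;
CREATE PUBLICATION pub_active FOR TABLE employees WHERE (active IS TRUE);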
If you choose to do the initial table synchronization, only data that
satisfies the row filters is sent. If the subscription has several
publications in which a table has been published with different WHERE
clauses, rows must satisfy all expressions to be copied.
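A sketch of that AND'ed behavior (names are hypothetical and the
connection string is elided):

CREATE PUBLICATION pub_a FOR TABLE t WHERE (x > 10);
CREATE PUBLICATION pub_b FOR TABLE t WHERE (y < 100);
-- on the subscriber, the initial copy receives only rows of t
-- satisfying (x > 10) AND (y < 100)
CREATE SUBSCRIPTION s CONNECTION '...' PUBLICATION pub_a, pub_b;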
If your publication contains a partitioned table, the parameter
publish_via_partition_root determines which row filter is used: the
partition's own filter (when the parameter is false, the default) or
the partitioned table's filter.
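For instance, with a hypothetical partitioned table measurement and
partition measurement_y2021:

-- publish_via_partition_root = false (default): the partition's filter applies
CREATE PUBLICATION pub_part FOR TABLE measurement_y2021 WHERE (logdate >= '2021-01-01');
-- publish_via_partition_root = true: the partitioned table's filter applies
CREATE PUBLICATION pub_root FOR TABLE measurement WHERE (peaktemp > 30)
    WITH (publish_via_partition_root = true);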
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 36 +++-
doc/src/sgml/ref/create_subscription.sgml | 8 +-
src/backend/catalog/pg_publication.c | 110 +++++++++++-
src/backend/commands/publicationcmds.c | 96 ++++++----
src/backend/parser/gram.y | 28 ++-
src/backend/parser/parse_agg.c | 10 ++
src/backend/parser/parse_expr.c | 13 ++
src/backend/parser/parse_func.c | 3 +
src/backend/replication/logical/tablesync.c | 93 +++++++++-
src/backend/replication/logical/worker.c | 14 +-
src/backend/replication/pgoutput/pgoutput.c | 179 ++++++++++++++++--
src/include/catalog/pg_publication.h | 10 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/include/replication/logicalrelation.h | 2 +
src/test/regress/expected/publication.out | 32 ++++
src/test/regress/sql/publication.sql | 23 +++
src/test/subscription/t/020_row_filter.pl | 190 ++++++++++++++++++++
22 files changed, 800 insertions(+), 85 deletions(-)
create mode 100644 src/test/subscription/t/020_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 865e826fb0..1ea4076219 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -5493,6 +5493,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
are simple references.
</para></entry>
</row>
+
+ <row>
+ <entry><structfield>prqual</structfield></entry>
+ <entry><type>pg_node_tree</type></entry>
+ <entry></entry>
+ <entry>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b2c6..ca091aae33 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is executed with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbca55..2136545b04 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression.
</para>
<para>
@@ -131,9 +135,15 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
on its partitions) contained in the publication will be published
using the identity and schema of the partitioned table rather than
that of the individual partitions that are actually changed; the
- latter is the default. Enabling this allows the changes to be
- replicated into a non-partitioned table or a partitioned table
- consisting of a different set of partitions.
+ latter is the default (<literal>false</literal>). Enabling this
+ allows the changes to be replicated into a non-partitioned table or a
+ partitioned table consisting of a different set of partitions.
+ </para>
+
+ <para>
+ If this parameter is <literal>false</literal>, the <literal>WHERE</literal>
+ clause from the partition is used; otherwise, the <literal>WHERE</literal>
+ clause from the partitioned table is used.
</para>
<para>
@@ -182,6 +192,12 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
disallowed on those tables.
</para>
+ <para>
+ Columns used in the <literal>WHERE</literal> clause must be part of the
+ primary key or be covered by <literal>REPLICA IDENTITY</literal>; otherwise
+ <command>DELETE</command> operations will not be replicated.
+ </para>
+
<para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
@@ -197,6 +213,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -209,6 +230,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</programlisting>
</para>
+ <para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
<para>
Create a publication that publishes all changes in all tables:
<programlisting>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index e812beee37..b8f4ea5603 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,13 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 84d2efcfd2..fd9549a630 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,11 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
+
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,18 +146,20 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -161,7 +168,7 @@ publication_add_relation(Oid pubid, Relation targetrel,
* duplicates, it's here just to provide nicer error message in common
* case. The real protection is the unique key on the catalog.
*/
- if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid),
+ if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(targetrel->relid),
ObjectIdGetDatum(pubid)))
{
table_close(rel, RowExclusiveLock);
@@ -172,10 +179,27 @@ publication_add_relation(Oid pubid, Relation targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel), pub->name)));
+ RelationGetRelationName(targetrel->relation), pub->name)));
}
- check_publication_add_relation(targetrel);
+ check_publication_add_relation(targetrel->relation);
+
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(targetrel->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(targetrel->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -187,7 +211,13 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prpubid - 1] =
ObjectIdGetDatum(pubid);
values[Anum_pg_publication_rel_prrelid - 1] =
- ObjectIdGetDatum(relid);
+ ObjectIdGetDatum(targetrel->relid);
+
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
@@ -202,14 +232,20 @@ publication_add_relation(Oid pubid, Relation targetrel,
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
/* Add dependency on the relation */
- ObjectAddressSet(referenced, RelationRelationId, relid);
+ ObjectAddressSet(referenced, RelationRelationId, targetrel->relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel);
+ CacheInvalidateRelcache(targetrel->relation);
return myself;
}
@@ -304,6 +340,64 @@ GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt)
return result;
}
+/*
+ * Gets list of PublicationRelationQuals for a publication.
+ */
+List *
+GetPublicationRelationQuals(Oid pubid)
+{
+ List *result;
+ Relation pubrelsrel;
+ ScanKeyData scankey;
+ SysScanDesc scan;
+ HeapTuple tup;
+
+ /* Find all publications associated with the relation. */
+ pubrelsrel = table_open(PublicationRelRelationId, AccessShareLock);
+
+ ScanKeyInit(&scankey,
+ Anum_pg_publication_rel_prpubid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(pubid));
+
+ scan = systable_beginscan(pubrelsrel, PublicationRelPrrelidPrpubidIndexId,
+ true, NULL, 1, &scankey);
+
+ result = NIL;
+ while (HeapTupleIsValid(tup = systable_getnext(scan)))
+ {
+ Form_pg_publication_rel pubrel;
+ PublicationRelationQual *prq;
+ Datum value_datum;
+ char *qual_value;
+ Node *qual_expr;
+ bool isnull;
+
+ pubrel = (Form_pg_publication_rel) GETSTRUCT(tup);
+
+ value_datum = heap_getattr(tup, Anum_pg_publication_rel_prqual, RelationGetDescr(pubrelsrel), &isnull);
+ if (!isnull)
+ {
+ qual_value = TextDatumGetCString(value_datum);
+ qual_expr = (Node *) stringToNode(qual_value);
+ }
+ else
+ qual_expr = NULL;
+
+ prq = palloc(sizeof(PublicationRelationQual));
+ prq->relid = pubrel->prrelid;
+ /* table will be opened in AlterPublicationTables */
+ prq->relation = NULL;
+ prq->whereClause = qual_expr;
+ result = lappend(result, prq);
+ }
+
+ systable_endscan(scan);
+ table_close(pubrelsrel, AccessShareLock);
+
+ return result;
+}
+
/*
* Gets list of publication oids for publications marked as FOR ALL TABLES.
*/
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 95c253c8e0..a5eccbbfb5 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -372,6 +372,28 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ /*
+ * Although the ALTER PUBLICATION grammar allows a WHERE clause to be specified
+ * for the DROP TABLE action, it doesn't make sense to allow it. We implement
+ * this restriction here instead of complicating the grammar to enforce
+ * it.
+ */
+ if (stmt->tableAction == DEFELEM_DROP)
+ {
+ ListCell *lc;
+
+ foreach(lc, stmt->tables)
+ {
+ PublicationTable *t = lfirst(lc);
+
+ if (t->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot use a WHERE clause when removing table from publication \"%s\"",
+ NameStr(pubform->pubname))));
+ }
+ }
+
rels = OpenTableList(stmt->tables);
if (stmt->tableAction == DEFELEM_ADD)
@@ -389,27 +411,22 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
+ PublicationRelationQual *oldrel;
- foreach(newlc, rels)
- {
- Relation newrel = (Relation) lfirst(newlc);
+ /*
+ * Remove all publication-table mappings. We could possibly
+ * remove (i) tables that are not found in the new table list and
+ * (ii) tables that are being re-added with a different qual
+ * expression. For (ii), simply updating the existing tuple is not
+ * enough, because of qual expression dependencies.
+ */
+ oldrel = palloc(sizeof(PublicationRelationQual));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
- if (RelationGetRelid(newrel) == oldrelid)
- {
- found = true;
- break;
- }
- }
-
- if (!found)
- {
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
-
- delrels = lappend(delrels, oldrel);
- }
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -509,13 +526,15 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationQual *prq;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
+ PublicationTable *t = lfirst(lc);
+ RangeVar *rv = castNode(RangeVar, t->relation);
bool recurse = rv->inh;
Relation rel;
Oid myrelid;
@@ -538,8 +557,11 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ prq = palloc(sizeof(PublicationRelationQual));
+ prq->relid = myrelid;
+ prq->relation = rel;
+ prq->whereClause = t->whereClause;
+ rels = lappend(rels, prq);
relids = lappend_oid(relids, myrelid);
/*
@@ -572,7 +594,12 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ prq = palloc(sizeof(PublicationRelationQual));
+ prq->relid = childrelid;
+ prq->relation = rel;
+ /* child inherits WHERE clause from parent */
+ prq->whereClause = t->whereClause;
+ rels = lappend(rels, prq);
relids = lappend_oid(relids, childrelid);
}
}
@@ -593,10 +620,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *prq = (PublicationRelationQual *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(prq->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -612,15 +641,15 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *prq = (PublicationRelationQual *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(prq->relid, GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(prq->relation->rd_rel->relkind),
+ RelationGetRelationName(prq->relation));
- obj = publication_add_relation(pubid, rel, if_not_exists);
+ obj = publication_add_relation(pubid, prq, if_not_exists);
if (stmt)
{
EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
@@ -644,11 +673,10 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationQual *prq = (PublicationRelationQual *) lfirst(lc);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(prq->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -658,7 +686,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(prq->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index ecfd98ba5b..54d8f26a64 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -416,12 +416,12 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
vacuum_relation_list opt_vacuum_relation_list
- drop_option_list
+ drop_option_list publication_table_list
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -3806,7 +3806,7 @@ ConstraintElem:
$$ = (Node *)n;
}
| EXCLUDE access_method_clause '(' ExclusionConstraintList ')'
- opt_c_include opt_definition OptConsTableSpace OptWhereClause
+ opt_c_include opt_definition OptConsTableSpace OptWhereClause
ConstraintAttributeSpec
{
Constraint *n = makeNode(Constraint);
@@ -9499,7 +9499,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9530,7 +9530,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9538,7 +9538,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9546,7 +9546,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE relation_expr_list
+ | ALTER PUBLICATION name DROP TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9556,6 +9556,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index fd08b9eeff..06b184f404 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -544,6 +544,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -940,6 +947,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 6c87783b2c..4b4fb0dc4d 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -508,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1765,6 +1773,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3048,6 +3059,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 37cebc7d82..bf909d6ed5 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2530,6 +2530,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index a18f847ade..63d00a75ed 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -630,19 +630,25 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
+
+ memset(lrel, 0, sizeof(LogicalRepRelation));
lrel->nspname = nspname;
lrel->relname = relname;
@@ -731,6 +737,51 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
+ walrcv_clear_result(res);
+
+ /* Get relation qual */
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(wrconn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
walrcv_clear_result(res);
pfree(cmd.data);
}
@@ -745,6 +796,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -753,7 +805,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -762,16 +814,23 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* List of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && list_length(qual) == 0)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -780,9 +839,31 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (list_length(qual) > 0)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
+
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index eb7db89cef..6092ae0df0 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -340,8 +340,8 @@ handle_streamed_transaction(LogicalRepMsgType action, StringInfo s)
*
* This is based on similar code in copy.c
*/
-static EState *
-create_estate_for_relation(LogicalRepRelMapEntry *rel)
+EState *
+create_estate_for_relation(Relation rel)
{
EState *estate;
RangeTblEntry *rte;
@@ -350,8 +350,8 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel)
rte = makeNode(RangeTblEntry);
rte->rtekind = RTE_RELATION;
- rte->relid = RelationGetRelid(rel->localrel);
- rte->relkind = rel->localrel->rd_rel->relkind;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
rte->rellockmode = AccessShareLock;
ExecInitRangeTable(estate, list_make1(rte));
@@ -1176,7 +1176,7 @@ apply_handle_insert(StringInfo s)
}
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -1301,7 +1301,7 @@ apply_handle_update(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -1458,7 +1458,7 @@ apply_handle_delete(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 79765f9696..dd6f3bda3a 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -15,12 +15,22 @@
#include "access/tupconvert.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
+#include "catalog/pg_type.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/execnodes.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/planner.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
@@ -57,6 +67,8 @@ static void pgoutput_stream_abort(struct LogicalDecodingContext *ctx,
static void pgoutput_stream_commit(struct LogicalDecodingContext *ctx,
ReorderBufferTXN *txn,
XLogRecPtr commit_lsn);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, List *rowfilter);
static bool publications_valid;
static bool in_streaming;
@@ -98,6 +110,7 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *qual;
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -121,7 +134,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation rel);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -489,6 +502,99 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
OutputPluginWrite(ctx, false);
}
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false i.e. the change
+ * isn't replicated.
+ */
+static inline bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, List *rowfilter)
+{
+ TupleDesc tupdesc;
+ EState *estate;
+ ExprContext *ecxt;
+ MemoryContext oldcxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (list_length(rowfilter) == 0)
+ return true;
+
+ tupdesc = RelationGetDescr(relation);
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldcxt);
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, rowfilter)
+ {
+ Node *rfnode = (Node *) lfirst(lc);
+ Oid exprtype;
+ Expr *expr;
+ ExprState *exprstate;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ exprstate = ExecPrepareExpr(expr, estate);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+
+ return result;
+}
+
/*
* Sends the decoded DML over wire.
*
@@ -516,7 +622,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -560,6 +666,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
tuple = execute_attr_map_tuple(tuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry->qual))
+ return;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_insert(ctx->out, xid, relation, tuple,
data->binary);
@@ -586,6 +696,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry->qual))
+ return;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_update(ctx->out, xid, relation, oldtuple,
newtuple, data->binary);
@@ -608,6 +722,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry->qual))
+ return;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
data->binary);
@@ -655,12 +773,11 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
for (i = 0; i < nrelations; i++)
{
Relation relation = relations[i];
- Oid relid = RelationGetRelid(relation);
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -670,10 +787,10 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
* root tables through it.
*/
if (relation->rd_rel->relispartition &&
- relentry->publish_as_relid != relid)
+ relentry->publish_as_relid != relentry->relid)
continue;
- relids[nrelids++] = relid;
+ relids[nrelids++] = relentry->relid;
maybe_send_schema(ctx, txn, change, relation, relentry);
}
@@ -941,16 +1058,21 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation rel)
{
RelationSyncEntry *entry;
- bool am_partition = get_rel_relispartition(relid);
- char relkind = get_rel_relkind(relid);
+ Oid relid;
+ bool am_partition;
+ char relkind;
bool found;
MemoryContext oldctx;
Assert(RelationSyncCache != NULL);
+ relid = RelationGetRelid(rel);
+ am_partition = get_rel_relispartition(relid);
+ relkind = get_rel_relkind(relid);
+
/* Find cached relation info, creating if not found */
entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
(void *) &relid,
@@ -966,6 +1088,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->qual = NIL;
entry->publish_as_relid = InvalidOid;
}
@@ -997,6 +1120,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1056,9 +1182,29 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ entry->qual = lappend(entry->qual, rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1164,6 +1310,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1173,6 +1320,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1190,5 +1339,11 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (list_length(entry->qual) > 0)
+ list_free_deep(entry->qual);
+ entry->qual = NIL;
}
+
+ MemoryContextSwitchTo(oldctx);
}
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 4127611f5a..0263baf72a 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,6 +85,13 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationQual
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationQual;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -106,11 +113,12 @@ typedef enum PublicationPartOpt
} PublicationPartOpt;
extern List *GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt);
+extern List *GetPublicationRelationQuals(Oid pubid);
extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index c79b7fb487..d26673a111 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid; /* Oid of the publication */
Oid prrelid; /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, on pg_publication_rel using btree(oid oid_ops));
#define PublicationRelObjectIndexId 6112
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 40ae489c23..cabda6f3c4 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -482,6 +482,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 236832a2ca..65f7a8884f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3543,12 +3543,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3561,7 +3568,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 176b9f37c1..edb80fbe5d 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -79,6 +79,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 3f0b3deefb..a59ad2c9c8 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -49,4 +49,6 @@ extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
extern void logicalrep_typmap_update(LogicalRepTyp *remotetyp);
extern char *logicalrep_typmap_gettypname(Oid remoteid);
+extern EState *create_estate_for_relation(Relation rel);
+
#endif /* LOGICALRELATION_H */
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 63d6ab7a4e..c8cf1b685e 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -156,6 +156,38 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: cannot use a WHERE clause when removing table from publication "testpub5"
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: "testpub_view" is not a table
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075368..35211c56f6 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,29 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+\dRp+ testpub5
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/020_row_filter.pl b/src/test/subscription/t/020_row_filter.pl
new file mode 100644
index 0000000000..b8c059d44b
--- /dev/null
+++ b/src/test/subscription/t/020_row_filter.pl
@@ -0,0 +1,190 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 6;
+
+# create publisher node
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+
+# test row filtering
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x"
+);
+# use partition row filter:
+# - replicate (1, 100) because 1 < 6000 is true
+# - don't replicate (8000, 101) because 8000 < 6000 is false
+# - replicate (15000, 102) because partition tab_rowfilter_greater_10k doesn't have row filter
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(8000, 101),(15000, 102)"
+);
+# insert directly into partition
+# use partition row filter: replicate (2, 200) because 2 < 6000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200)");
+# use partition row filter: replicate (5500, 300) because 5500 < 6000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(5500, 300)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table sync to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+5500|300), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102), 'check filtered data was copied to subscriber');
+
+# publish using partitioned table
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+# use partitioned table row filter: replicate, 4000 < 5000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400)");
+# use partitioned table row filter: replicate, 4500 < 5000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+# use partitioned table row filter: don't replicate, 5600 < 5000 is false
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+# use partitioned table row filter: don't replicate, 16000 < 5000 is false
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 1950)");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
2.20.1
Attachment: v10-0004-Print-publication-WHERE-condition-in-psql.patch (text/x-patch)

From 96f5c57bccb4bd9cae8546ecefcca454b628e927 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 12:09:19 -0300
Subject: [PATCH 4/7] Print publication WHERE condition in psql
---
src/bin/psql/describe.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 20af5a92b4..8a7113ce15 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6011,7 +6011,8 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
+ "SELECT n.nspname, c.relname,\n"
+ " pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
@@ -6041,6 +6042,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE %s",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
--
2.20.1
Attachment: v10-0005-Publication-WHERE-condition-support-for-pg_dump.patch (text/x-patch)
From 35b8cc1e0d9ca8e0e5431073cc10b80f5f16972e Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 14:29:59 -0300
Subject: [PATCH 5/7] Publication WHERE condition support for pg_dump
---
src/bin/pg_dump/pg_dump.c | 14 ++++++++++++--
src/bin/pg_dump/pg_dump.h | 1 +
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 39da742e32..50388ae8ca 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4080,6 +4080,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4091,7 +4092,8 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
/* Collect all publication membership info. */
appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
"FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
@@ -4101,6 +4103,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4141,6 +4144,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4173,8 +4180,11 @@ dumpPublicationTable(Archive *fout, PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE %s", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 1290f9659b..7927039d1d 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -625,6 +625,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
--
2.20.1
Attachment: v10-0006-Debug-messages.patch (text/x-patch)
From 7a855df32c456ead8f0634f122c569c82022ba88 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Fri, 29 Jan 2021 23:28:13 -0300
Subject: [PATCH 6/7] Debug messages
---
src/backend/replication/pgoutput/pgoutput.c | 14 ++++++++++++++
src/test/subscription/t/020_row_filter.pl | 1 +
2 files changed, 15 insertions(+)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index dd6f3bda3a..17a728c0bf 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -518,6 +518,10 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+ elog(DEBUG2, "pgoutput_row_filter_exec_expr: ret: %d ; isnull: %d",
+ DatumGetBool(ret) ? 1 : 0,
+ isnull ? 1 : 0);
+
if (isnull)
return false;
@@ -543,6 +547,8 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
if (list_length(rowfilter) == 0)
return true;
+ elog(DEBUG1, "table %s has row filter", get_rel_name(relation->rd_id));
+
tupdesc = RelationGetDescr(relation);
estate = create_estate_for_relation(relation);
@@ -566,6 +572,7 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
Oid exprtype;
Expr *expr;
ExprState *exprstate;
+ char *s = NULL;
/* Prepare expression for execution */
exprtype = exprType(rfnode);
@@ -585,6 +592,13 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
/* If the tuple does not match one of the row filters, bail out */
+ s = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, CStringGetTextDatum(nodeToString(rfnode)), ObjectIdGetDatum(relation->rd_id)));
+ if (result)
+ elog(DEBUG2, "pgoutput_row_filter: row filter \"%s\" matched", s);
+ else
+ elog(DEBUG2, "pgoutput_row_filter: row filter \"%s\" not matched", s);
+ pfree(s);
+
if (!result)
break;
}
diff --git a/src/test/subscription/t/020_row_filter.pl b/src/test/subscription/t/020_row_filter.pl
index b8c059d44b..ea1d7c30ae 100644
--- a/src/test/subscription/t/020_row_filter.pl
+++ b/src/test/subscription/t/020_row_filter.pl
@@ -8,6 +8,7 @@ use Test::More tests => 6;
# create publisher node
my $node_publisher = get_new_node('publisher');
$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf', 'log_min_messages = DEBUG2');
$node_publisher->start;
# create subscriber node
--
2.20.1
v10-0007-Measure-row-filter-overhead.patch (text/x-patch) Download
From a861145175c04c25fc7db662b55dd7a9cfddbccd Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Sun, 31 Jan 2021 20:48:43 -0300
Subject: [PATCH 7/7] Measure row filter overhead
---
src/backend/replication/pgoutput/pgoutput.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 17a728c0bf..ebd45c1c84 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -542,6 +542,8 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
MemoryContext oldcxt;
ListCell *lc;
bool result = true;
+ instr_time start_time;
+ instr_time end_time;
/* Bail out if there is no row filter */
if (list_length(rowfilter) == 0)
@@ -549,6 +551,8 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
elog(DEBUG1, "table %s has row filter", get_rel_name(relation->rd_id));
+ INSTR_TIME_SET_CURRENT(start_time);
+
tupdesc = RelationGetDescr(relation);
estate = create_estate_for_relation(relation);
@@ -606,6 +610,11 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
ResetExprContext(ecxt);
FreeExecutorState(estate);
+ INSTR_TIME_SET_CURRENT(end_time);
+ INSTR_TIME_SUBTRACT(end_time, start_time);
+
+ elog(DEBUG2, "row filter time: %0.3f us", INSTR_TIME_GET_DOUBLE(end_time) * 1e6);
+
return result;
}
--
2.20.1
On Mon, Feb 01, 2021 at 04:11:50PM -0300, Euler Taveira wrote:
After the commit 3696a600e2, the last patch does not apply cleanly. I'm
attaching another version to address the documentation issues.
I have bumped into this thread and applied 0001. My guess is that
one of the patches developed originally for logical replication
defined atttypmod in LogicalRepRelation, but ended up not using
it. Nice catch.
--
Michael
On Tue, 02 Feb 2021 at 03:11, Euler Taveira <euler@eulerto.com> wrote:
On Mon, Feb 1, 2021, at 6:11 AM, japin wrote:
Thanks for updating the patch. Here are some comments:
Thanks for your review. I updated the documentation accordingly.
The documentation says:
Columns used in the <literal>WHERE</literal> clause must be part of the
primary key or be covered by <literal>REPLICA IDENTITY</literal> otherwise
<command>UPDATE</command> and <command>DELETE</command> operations will not
be replicated.
The UPDATE is an oversight from a previous version.
Does the publication only load the REPLICA IDENTITY columns into oldtuple when we
execute DELETE? If so, pgoutput_row_filter() cannot find the non-REPLICA IDENTITY
columns, which causes it to return false, right? If that's right, UPDATE might not
be limited by REPLICA IDENTITY, because all columns are in newtuple, isn't it?
No. oldtuple could possibly be available for UPDATE and DELETE. However, the row
filter considers only one tuple for filtering. INSERT has only newtuple; the row
filter uses it. UPDATE has newtuple and optionally oldtuple (if it has a PK or
REPLICA IDENTITY); the row filter uses newtuple. DELETE optionally has only
oldtuple; the row filter uses it (if available). Keep in mind that if the expression
evaluates to NULL, it returns false and the row won't be replicated.
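To make these rules concrete, here is a hypothetical example (the table and filter are illustrative, not taken from the patch set):

```sql
CREATE TABLE items (id int PRIMARY KEY, qty int);
CREATE PUBLICATION pub_items FOR TABLE items WHERE (qty > 10);

-- INSERT and UPDATE evaluate the filter against the new tuple, so any
-- column can be referenced:
INSERT INTO items VALUES (1, 5);    -- qty > 10 is false: not replicated
INSERT INTO items VALUES (2, 50);   -- qty > 10 is true: replicated

-- DELETE evaluates the filter against the old tuple, which only carries
-- the PK/REPLICA IDENTITY columns (here: id). qty is therefore not
-- available, the expression evaluates to NULL, NULL is treated as false,
-- and the DELETE is not replicated:
DELETE FROM items WHERE id = 2;
```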
Thanks for your clarification.
--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
On Tue, 02 Feb 2021 at 13:02, Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Feb 01, 2021 at 04:11:50PM -0300, Euler Taveira wrote:
After the commit 3696a600e2, the last patch does not apply cleanly. I'm
attaching another version to address the documentation issues.
I have bumped into this thread and applied 0001. My guess is that
one of the patches developed originally for logical replication
defined atttypmod in LogicalRepRelation, but ended up not using
it. Nice catch.
Since the 0001 patch has already been committed (4ad31bb2ef), we can remove it.
--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
On Tue, 02 Feb 2021 at 19:16, japin <japinli@hotmail.com> wrote:
On Tue, 02 Feb 2021 at 13:02, Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Feb 01, 2021 at 04:11:50PM -0300, Euler Taveira wrote:
After the commit 3696a600e2, the last patch does not apply cleanly. I'm
attaching another version to address the documentation issues.
I have bumped into this thread and applied 0001. My guess is that
one of the patches developed originally for logical replication
defined atttypmod in LogicalRepRelation, but ended up not using
it. Nice catch.
Since the 0001 patch has already been committed (4ad31bb2ef), we can remove it.
In the 0003 patch, the function GetPublicationRelationQuals() has been defined, but it
is never used. So why should we define it?
$ grep 'GetPublicationRelationQuals' -rn src/
src/include/catalog/pg_publication.h:116:extern List *GetPublicationRelationQuals(Oid pubid);
src/backend/catalog/pg_publication.c:347:GetPublicationRelationQuals(Oid pubid)
If we must keep it, here are some comments on it.
(1)
value_datum = heap_getattr(tup, Anum_pg_publication_rel_prqual, RelationGetDescr(pubrelsrel), &isnull);
This line looks too long; we can split it into two lines.
(2)
Since qual_value is only used in the "if (!isnull)" branch, we can narrow its scope.
(3)
Should we free the memory for qual_value?
--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
On Tue, Feb 2, 2021, at 8:38 AM, japin wrote:
In the 0003 patch, the function GetPublicationRelationQuals() has been defined, but it
is never used. So why should we define it?
Thanks for taking a look again. It is an oversight. It was introduced in an
attempt to refactor ALTER PUBLICATION SET TABLE. In AlterPublicationTables, we
could possibly keep some publication-table mappings that do not change;
however, since commit 3696a600e2, it is required to find the qual for all
inheritors (see GetPublicationRelations). I explain this decision in the
following comment:
/*
* Remove all publication-table mappings. We could possibly
* remove (i) tables that are not found in the new table list and
* (ii) tables that are being re-added with a different qual
* expression. For (ii), simply updating the existing tuple is not
* enough, because of qual expression dependencies.
*/
I will post a new patch set later.
--
Euler Taveira
EDB https://www.enterprisedb.com/
Hi,
Thanks for working on this. I did some review and testing, please see my
comments below.
1)
0008 is only for debug purposes (I'm
not sure some of these messages will be part of the final patch).
I think if you are planning to keep the debug patch, there seems to be an
area of improvement in the following code:
/* If the tuple does not match one of the row filters, bail out */
+ s = TextDatumGetCString(DirectFunctionCall2(pg_get_expr,
CStringGetTextDatum(nodeToString(rfnode)),
ObjectIdGetDatum(relation->rd_id)));
+ if (result)
+ elog(DEBUG2, "pgoutput_row_filter: row filter \"%s\" matched", s);
+ else
+ elog(DEBUG2, "pgoutput_row_filter: row filter \"%s\" not matched", s);
+ pfree(s);
+
We only need to calculate "s" if the debug level is DEBUG2 or higher. So, we
could maybe do something like this:
if (log_min_messages <= DEBUG2 || client_min_messages <= DEBUG2)
{
/* and the code block is moved to here */
}
2) I have done some tests with some different expressions that don't exist
on the regression tests, just to make sure that we don't have any edge
cases. All seems to work fine for the expressions
like: (column/1.0)::int::bool::text::bool, CASE WHEN column1> 4000 THEN
column2/ 100 > 1 ELSE false END, COALESCE((column/50000)::bool, false),
NULLIF((column/50000)::int::bool, false), column IS DISTINCT FROM 50040,
row(column, 2, 3) > row(2000, 2, 3), (column IS DISTINCT FROM), column IS
NULL, column IS NOT NULL, composite types
3) As another edge case exploration, I tested with tables/types on
different schemas with escape chars on the schemas/custom types etc. All
looks good.
4) In terms of performance, I also separately verified that the overhead
seems pretty low with the final patch. I used the tests in the
commands_to_perf_test.sql file which I shared earlier. The steps in the
file do not intend to measure the time precisely per tuple, but just to see
if there is any noticeable regression while moving lots of data. The
difference between (a) no filter and (b) a simple filter is between 1% and 4%,
which could even be considered to be at the noise level.
5) I guess we can bypass the function limitation via operators. Do you see
anything problematic with that? I think it should be allowed, as it helps
power users create more complex replication setups if they need to.
CREATE FUNCTION simple_f(int, int) RETURNS bool
AS $$ SELECT hashint4($1) > $2 $$
LANGUAGE SQL;
CREATE OPERATOR =*>
(
PROCEDURE = simple_f,
LEFTARG = int,
RIGHTARG = int
);
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key =*> 1000);
5.1) It might be useful to have a regression test which uses a user-defined
operator in the WHERE clause and checks that DROP without CASCADE is not
allowed, so that we cover the recordDependencyOnExpr() calls in the tests.
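The scenario in 5.1 could look something like the following sketch (the exact error behaviour depends on how the dependency is recorded):

```sql
-- With pub_1 created as above, the publication's WHERE expression depends
-- on the =*> operator via recordDependencyOnExpr(), so:
DROP OPERATOR =*> (int, int);           -- should fail without CASCADE
DROP OPERATOR =*> (int, int) CASCADE;   -- drops the dependent mapping too
```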
Thanks,
Onder KALACI
Software Engineer at Microsoft
On Tue, 2 Feb 2021 at 13:34, Euler Taveira <euler@eulerto.com> wrote:
On Mon, Feb 22, 2021, at 7:45 AM, Önder Kalacı wrote:
Thanks for working on this. I did some review and testing, please see my comments below.
I appreciate your review. I'm working on a new patch set and expect to post it soon.
I think if you are planning to keep the debug patch, there seems to be an area of improvement in the following code:
I was not planning to include the debug part; however, it would probably help to
debug some use cases. In the "row [not] matched" message, it should be DEBUG3
in the final version because it is too noisy. Since you mentioned it, I will
inspect those DEBUG messages and include in the main patch the ones that could
be useful for debugging purposes.
4) In terms of performance, I also separately verified that the overhead seems pretty low with the final patch. I used the tests in the commands_to_perf_test.sql file which I shared earlier. The steps in the file do not intend to measure the time precisely per tuple, but just to see if there is any noticeable regression while moving lots of data. The difference between (a) no filter and (b) a simple filter is between 1% and 4%, which could even be considered to be at the noise level.
I'm concerned about it too; I'm currently experimenting with alternatives to reduce this overhead.
5) I guess we can bypass the function limitation via operators. Do you see anything problematic with that? I think it should be allowed, as it helps power users create more complex replication setups if they need to.
Yes, you can. I have to check if this user-defined operator could possibly
break the replication. I will make sure to include a test covering this case
too.
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Mon, Feb 22, 2021, at 9:28 AM, Euler Taveira wrote:
On Mon, Feb 22, 2021, at 7:45 AM, Önder Kalacı wrote:
Thanks for working on this. I did some review and testing, please see my comments below.
I appreciate your review. I'm working on a new patch set and expect to post it soon.
I'm attaching a new patch set. This new version improves the documentation and commit
messages and incorporates a few debug messages. I did a couple of tests and
didn't find any issues.
Here are some numbers from my i7 using a simple expression (aid > 0) on table
pgbench_accounts.
$ awk '/row filter time/ {print $9}' postgresql.log | /tmp/stat.pl 99
mean: 33.00 us
stddev: 17.65 us
median: 28.83 us
min-max: [3.48 .. 6404.84] us
percentile(99): 49.66 us
mode: 41.71 us
I don't expect 0005 and 0006 to be included. I attached them to help with some
tests.
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
0001-Rename-a-WHERE-node.patch (text/x-patch) Download
From 596b278d4be77f67467aa1d61ce26e765b078197 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 11:53:34 -0300
Subject: [PATCH 1/6] Rename a WHERE node
A WHERE clause will be used for row filtering in logical replication. We
already have a similar node: 'WHERE (condition here)'. Let's rename the
node to a generic name and use it for row filtering too.
---
src/backend/parser/gram.y | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index dd72a9fc3c..ecfd98ba5b 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -485,7 +485,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <node> def_arg columnElem where_clause where_or_current_clause
a_expr b_expr c_expr AexprConst indirection_el opt_slice_bound
columnref in_expr having_clause func_table xmltable array_expr
- ExclusionWhereClause operator_def_arg
+ OptWhereClause operator_def_arg
%type <list> rowsfrom_item rowsfrom_list opt_col_def_list
%type <boolean> opt_ordinality
%type <list> ExclusionConstraintList ExclusionConstraintElem
@@ -3806,7 +3806,7 @@ ConstraintElem:
$$ = (Node *)n;
}
| EXCLUDE access_method_clause '(' ExclusionConstraintList ')'
- opt_c_include opt_definition OptConsTableSpace ExclusionWhereClause
+ opt_c_include opt_definition OptConsTableSpace OptWhereClause
ConstraintAttributeSpec
{
Constraint *n = makeNode(Constraint);
@@ -3908,7 +3908,7 @@ ExclusionConstraintElem: index_elem WITH any_operator
}
;
-ExclusionWhereClause:
+OptWhereClause:
WHERE '(' a_expr ')' { $$ = $3; }
| /*EMPTY*/ { $$ = NULL; }
;
--
2.20.1
0002-Row-filter-for-logical-replication.patch (text/x-patch) Download
From 9b1d46d5c146d591022c12c7acec8d61d4e0bfcc Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 12:07:51 -0300
Subject: [PATCH 2/6] Row filter for logical replication
This feature adds a row filter for publication tables. When you define or modify
a publication you can optionally filter out rows that do not satisfy a WHERE
condition. It allows you to partially replicate a database or set of tables.
The row filter is per table, which means that you can define different row
filters for different tables. A new row filter can be added simply by
specifying a WHERE clause after the table name. The WHERE expression must be
enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, DELETEs
won't be replicated. DELETE uses the old row version (which is limited to the
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; this could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied.
If your publication contains a partitioned table, the parameter
publish_via_partition_root determines whether it uses the partition's row
filter (if the parameter is false -- the default) or the partitioned table's
row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 38 +++-
doc/src/sgml/ref/create_subscription.sgml | 8 +-
src/backend/catalog/pg_publication.c | 52 ++++-
src/backend/commands/publicationcmds.c | 96 +++++----
src/backend/parser/gram.y | 28 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 13 ++
src/backend/parser/parse_func.c | 3 +
src/backend/replication/logical/tablesync.c | 93 ++++++++-
src/backend/replication/logical/worker.c | 14 +-
src/backend/replication/pgoutput/pgoutput.c | 212 ++++++++++++++++++--
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/include/replication/logicalrelation.h | 2 +
src/test/regress/expected/publication.out | 32 +++
src/test/regress/sql/publication.sql | 23 +++
src/test/subscription/t/020_row_filter.pl | 190 ++++++++++++++++++
22 files changed, 776 insertions(+), 85 deletions(-)
create mode 100644 src/test/subscription/t/020_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index db29905e91..7466965ad3 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -5495,6 +5495,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
are simple references.
</para></entry>
</row>
+
+ <row>
+ <entry><structfield>prqual</structfield></entry>
+ <entry><type>pg_node_tree</type></entry>
+ <entry></entry>
+ <entry>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b2c6..ca091aae33 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is executed with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbca55..715c37f2bb 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression.
</para>
<para>
@@ -131,9 +135,15 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
on its partitions) contained in the publication will be published
using the identity and schema of the partitioned table rather than
that of the individual partitions that are actually changed; the
- latter is the default. Enabling this allows the changes to be
- replicated into a non-partitioned table or a partitioned table
- consisting of a different set of partitions.
+ latter is the default (<literal>false</literal>). Enabling this
+ allows the changes to be replicated into a non-partitioned table or a
+ partitioned table consisting of a different set of partitions.
+ </para>
+
+ <para>
+ If this parameter is <literal>false</literal>, it uses the
+ <literal>WHERE</literal> clause from the partition; otherwise, the
+ <literal>WHERE</literal> clause from the partitioned table is used.
</para>
<para>
@@ -182,6 +192,14 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
disallowed on those tables.
</para>
+ <para>
+ The <literal>WHERE</literal> clause should probably contain only columns
+ that are part of the primary key or be covered by <literal>REPLICA
+ IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
+ be replicated. For <command>INSERT</command> and <command>UPDATE</command>
+ operations, any column can be used in the <literal>WHERE</literal> clause.
+ </para>
+
<para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
@@ -197,6 +215,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -209,6 +232,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</programlisting>
</para>
+ <para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
<para>
Create a publication that publishes all changes in all tables:
<programlisting>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index e812beee37..b8f4ea5603 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,13 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 84d2efcfd2..7258f7c9a6 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,11 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
+
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,18 +146,20 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -161,7 +168,7 @@ publication_add_relation(Oid pubid, Relation targetrel,
* duplicates, it's here just to provide nicer error message in common
* case. The real protection is the unique key on the catalog.
*/
- if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid),
+ if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(targetrel->relid),
ObjectIdGetDatum(pubid)))
{
table_close(rel, RowExclusiveLock);
@@ -172,10 +179,27 @@ publication_add_relation(Oid pubid, Relation targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel), pub->name)));
+ RelationGetRelationName(targetrel->relation), pub->name)));
}
- check_publication_add_relation(targetrel);
+ check_publication_add_relation(targetrel->relation);
+
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(targetrel->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(targetrel->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -187,7 +211,13 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prpubid - 1] =
ObjectIdGetDatum(pubid);
values[Anum_pg_publication_rel_prrelid - 1] =
- ObjectIdGetDatum(relid);
+ ObjectIdGetDatum(targetrel->relid);
+
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
@@ -202,14 +232,20 @@ publication_add_relation(Oid pubid, Relation targetrel,
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
/* Add dependency on the relation */
- ObjectAddressSet(referenced, RelationRelationId, relid);
+ ObjectAddressSet(referenced, RelationRelationId, targetrel->relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel);
+ CacheInvalidateRelcache(targetrel->relation);
return myself;
}
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 95c253c8e0..a5eccbbfb5 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -372,6 +372,28 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ /*
+ * Although ALTER PUBLICATION grammar allows WHERE clause to be specified
+ * for DROP TABLE action, it doesn't make sense to allow it. We implement
+ * this restriction here, instead of complicating the grammar to enforce
+ * it.
+ */
+ if (stmt->tableAction == DEFELEM_DROP)
+ {
+ ListCell *lc;
+
+ foreach(lc, stmt->tables)
+ {
+ PublicationTable *t = lfirst(lc);
+
+ if (t->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot use a WHERE clause when removing table from publication \"%s\"",
+ NameStr(pubform->pubname))));
+ }
+ }
+
rels = OpenTableList(stmt->tables);
if (stmt->tableAction == DEFELEM_ADD)
@@ -389,27 +411,22 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
+ PublicationRelationQual *oldrel;
- foreach(newlc, rels)
- {
- Relation newrel = (Relation) lfirst(newlc);
+ /*
+ * Remove all publication-table mappings. We could possibly
+ * remove (i) tables that are not found in the new table list and
+ * (ii) tables that are being re-added with a different qual
+ * expression. For (ii), simply updating the existing tuple is not
+ * enough, because of qual expression dependencies.
+ */
+ oldrel = palloc(sizeof(PublicationRelationQual));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
- if (RelationGetRelid(newrel) == oldrelid)
- {
- found = true;
- break;
- }
- }
-
- if (!found)
- {
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
-
- delrels = lappend(delrels, oldrel);
- }
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -509,13 +526,15 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationQual *prq;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
+ PublicationTable *t = lfirst(lc);
+ RangeVar *rv = castNode(RangeVar, t->relation);
bool recurse = rv->inh;
Relation rel;
Oid myrelid;
@@ -538,8 +557,11 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ prq = palloc(sizeof(PublicationRelationQual));
+ prq->relid = myrelid;
+ prq->relation = rel;
+ prq->whereClause = t->whereClause;
+ rels = lappend(rels, prq);
relids = lappend_oid(relids, myrelid);
/*
@@ -572,7 +594,12 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ prq = palloc(sizeof(PublicationRelationQual));
+ prq->relid = childrelid;
+ prq->relation = rel;
+ /* child inherits WHERE clause from parent */
+ prq->whereClause = t->whereClause;
+ rels = lappend(rels, prq);
relids = lappend_oid(relids, childrelid);
}
}
@@ -593,10 +620,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *prq = (PublicationRelationQual *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(prq->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -612,15 +641,15 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationQual *prq = (PublicationRelationQual *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(prq->relid, GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(prq->relation->rd_rel->relkind),
+ RelationGetRelationName(prq->relation));
- obj = publication_add_relation(pubid, rel, if_not_exists);
+ obj = publication_add_relation(pubid, prq, if_not_exists);
if (stmt)
{
EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
@@ -644,11 +673,10 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationQual *prq = (PublicationRelationQual *) lfirst(lc);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(prq->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -658,7 +686,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(prq->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index ecfd98ba5b..54d8f26a64 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -416,12 +416,12 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
vacuum_relation_list opt_vacuum_relation_list
- drop_option_list
+ drop_option_list publication_table_list
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -3806,7 +3806,7 @@ ConstraintElem:
$$ = (Node *)n;
}
| EXCLUDE access_method_clause '(' ExclusionConstraintList ')'
- opt_c_include opt_definition OptConsTableSpace OptWhereClause
+ opt_c_include opt_definition OptConsTableSpace OptWhereClause
ConstraintAttributeSpec
{
Constraint *n = makeNode(Constraint);
@@ -9499,7 +9499,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9530,7 +9530,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9538,7 +9538,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9546,7 +9546,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE relation_expr_list
+ | ALTER PUBLICATION name DROP TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9556,6 +9556,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index fd08b9eeff..06b184f404 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -544,6 +544,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -940,6 +947,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f869e159d6..8eceac241e 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -508,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1765,6 +1773,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3048,6 +3059,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 37cebc7d82..bf909d6ed5 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2530,6 +2530,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index feb634e7ac..02ac567a7b 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,25 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
+
+ memset(lrel, 0, sizeof(LogicalRepRelation));
lrel->nspname = nspname;
lrel->relname = relname;
@@ -789,6 +795,51 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
+ walrcv_clear_result(res);
+
+ /* Get relation qual */
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(wrconn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
walrcv_clear_result(res);
pfree(cmd.data);
}
@@ -803,6 +854,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -811,7 +863,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -820,16 +872,23 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* List of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && list_length(qual) == 0)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -838,9 +897,31 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (list_length(qual) > 0)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
+
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 18d05286b6..cfb69aeaa6 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -340,8 +340,8 @@ handle_streamed_transaction(LogicalRepMsgType action, StringInfo s)
*
* This is based on similar code in copy.c
*/
-static EState *
-create_estate_for_relation(LogicalRepRelMapEntry *rel)
+EState *
+create_estate_for_relation(Relation rel)
{
EState *estate;
RangeTblEntry *rte;
@@ -350,8 +350,8 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel)
rte = makeNode(RangeTblEntry);
rte->rtekind = RTE_RELATION;
- rte->relid = RelationGetRelid(rel->localrel);
- rte->relkind = rel->localrel->rd_rel->relkind;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
rte->rellockmode = AccessShareLock;
ExecInitRangeTable(estate, list_make1(rte));
@@ -1168,7 +1168,7 @@ apply_handle_insert(StringInfo s)
}
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -1293,7 +1293,7 @@ apply_handle_update(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -1450,7 +1450,7 @@ apply_handle_delete(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 1b993fb032..ce6da8de19 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -15,12 +15,22 @@
#include "access/tupconvert.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
+#include "catalog/pg_type.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/execnodes.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/planner.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
@@ -57,6 +67,10 @@ static void pgoutput_stream_abort(struct LogicalDecodingContext *ctx,
static void pgoutput_stream_commit(struct LogicalDecodingContext *ctx,
ReorderBufferTXN *txn,
XLogRecPtr commit_lsn);
+static ExprState *pgoutput_row_filter_prepare_expr(Node *rfnode, EState *estate);
+static inline bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, List *rowfilter);
static bool publications_valid;
static bool in_streaming;
@@ -98,6 +112,7 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *qual;
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -121,7 +136,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation rel);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -491,6 +506,130 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
OutputPluginWrite(ctx, false);
}
+static ExprState *
+pgoutput_row_filter_prepare_expr(Node *rfnode, EState *estate)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ exprstate = ExecPrepareExpr(expr, estate);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false, i.e. the change
+ * isn't replicated.
+ */
+static inline bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change should be replicated, otherwise false.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, List *rowfilter)
+{
+ TupleDesc tupdesc;
+ EState *estate;
+ ExprContext *ecxt;
+ MemoryContext oldcxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (list_length(rowfilter) == 0)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ tupdesc = RelationGetDescr(relation);
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldcxt);
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, rowfilter)
+ {
+ Node *rfnode = (Node *) lfirst(lc);
+ ExprState *exprstate;
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_prepare_expr(rfnode, estate);
+
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ if (message_level_is_interesting(DEBUG3))
+ {
+ char *s = NULL;
+
+ s = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, CStringGetTextDatum(nodeToString(rfnode)), ObjectIdGetDatum(relation->rd_id)));
+ if (result)
+ elog(DEBUG3, "row filter \"%s\" matched", s);
+ else
+ elog(DEBUG3, "row filter \"%s\" not matched", s);
+ pfree(s);
+ }
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+
+ return result;
+}
+
/*
* Sends the decoded DML over wire.
*
@@ -518,7 +657,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -562,6 +701,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
tuple = execute_attr_map_tuple(tuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry->qual))
+ return;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_insert(ctx->out, xid, relation, tuple,
data->binary);
@@ -588,6 +731,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry->qual))
+ return;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_update(ctx->out, xid, relation, oldtuple,
newtuple, data->binary);
@@ -610,6 +757,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry->qual))
+ return;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
data->binary);
@@ -657,12 +808,11 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
for (i = 0; i < nrelations; i++)
{
Relation relation = relations[i];
- Oid relid = RelationGetRelid(relation);
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -672,10 +822,10 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
* root tables through it.
*/
if (relation->rd_rel->relispartition &&
- relentry->publish_as_relid != relid)
+ relentry->publish_as_relid != relentry->relid)
continue;
- relids[nrelids++] = relid;
+ relids[nrelids++] = relentry->relid;
maybe_send_schema(ctx, txn, change, relation, relentry);
}
@@ -944,16 +1094,21 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation rel)
{
RelationSyncEntry *entry;
- bool am_partition = get_rel_relispartition(relid);
- char relkind = get_rel_relkind(relid);
+ Oid relid;
+ bool am_partition;
+ char relkind;
bool found;
MemoryContext oldctx;
Assert(RelationSyncCache != NULL);
+ relid = RelationGetRelid(rel);
+ am_partition = get_rel_relispartition(relid);
+ relkind = get_rel_relkind(relid);
+
/* Find cached relation info, creating if not found */
entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
(void *) &relid,
@@ -969,6 +1124,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->qual = NIL;
entry->publish_as_relid = InvalidOid;
}
@@ -1000,6 +1156,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1059,9 +1218,29 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache the row filter, if any. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ entry->qual = lappend(entry->qual, rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1167,6 +1346,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1176,6 +1356,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1193,5 +1375,11 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (list_length(entry->qual) > 0)
+ list_free_deep(entry->qual);
+ entry->qual = NIL;
}
+
+ MemoryContextSwitchTo(oldctx);
}
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 1b31fee9e3..fa29468b18 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,6 +85,13 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationQual
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationQual;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -110,7 +117,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationQual *targetrel,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index aecf53b3b3..e2becb12eb 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, on pg_publication_rel using btree(oid oid_ops));
#define PublicationRelObjectIndexId 6112
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 40ae489c23..cabda6f3c4 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -482,6 +482,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 236832a2ca..65f7a8884f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3543,12 +3543,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3561,7 +3568,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 176b9f37c1..edb80fbe5d 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -79,6 +79,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 3f0b3deefb..a59ad2c9c8 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -49,4 +49,6 @@ extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
extern void logicalrep_typmap_update(LogicalRepTyp *remotetyp);
extern char *logicalrep_typmap_gettypname(Oid remoteid);
+extern EState *create_estate_for_relation(Relation rel);
+
#endif /* LOGICALRELATION_H */
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 63d6ab7a4e..c8cf1b685e 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -156,6 +156,38 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: cannot use a WHERE clause when removing table from publication "testpub5"
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: "testpub_view" is not a table
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075368..35211c56f6 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,29 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+\dRp+ testpub5
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/020_row_filter.pl b/src/test/subscription/t/020_row_filter.pl
new file mode 100644
index 0000000000..b8c059d44b
--- /dev/null
+++ b/src/test/subscription/t/020_row_filter.pl
@@ -0,0 +1,190 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 6;
+
+# create publisher node
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+
+# test row filtering
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x"
+);
+# use partition row filter:
+# - replicate (1, 100) because 1 < 6000 is true
+# - don't replicate (8000, 101) because 8000 < 6000 is false
+# - replicate (15000, 102) because partition tab_rowfilter_greater_10k doesn't have row filter
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(8000, 101),(15000, 102)"
+);
+# insert directly into partition
+# use partition row filter: replicate (2, 200) because 2 < 6000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200)");
+# use partition row filter: replicate (5500, 300) because 5500 < 6000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(5500, 300)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table sync to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+5500|300), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102), 'check filtered data was copied to subscriber');
+
+# publish using partitioned table
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+# use partitioned table row filter: replicate, 4000 < 5000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400)");
+# use partitioned table row filter: replicate, 4500 < 5000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+# use partitioned table row filter: don't replicate, 5600 < 5000 is false
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+# use partitioned table row filter: don't replicate, 16000 < 5000 is false
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 1950)");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
2.20.1
0003-Print-publication-WHERE-condition-in-psql.patch (text/x-patch)
From 44ccb34fb666efc6a1ee48c767185de1d13472a3 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 12:09:19 -0300
Subject: [PATCH 3/6] Print publication WHERE condition in psql
---
src/bin/psql/describe.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 20af5a92b4..8a7113ce15 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6011,7 +6011,8 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
+ "SELECT n.nspname, c.relname,\n"
+ " pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
@@ -6041,6 +6042,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE %s",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
--
2.20.1
0004-Publication-WHERE-condition-support-for-pg_dump.patch (text/x-patch)
From c10c8accbf31d4f9b2e874472fdd2a768af3fcc9 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 14:29:59 -0300
Subject: [PATCH 4/6] Publication WHERE condition support for pg_dump
---
src/bin/pg_dump/pg_dump.c | 14 ++++++++++++--
src/bin/pg_dump/pg_dump.h | 1 +
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index eb988d7eb4..7b22d64b7c 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4080,6 +4080,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4091,7 +4092,8 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
/* Collect all publication membership info. */
appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
"FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
@@ -4101,6 +4103,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4141,6 +4144,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4173,8 +4180,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE %s", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 0a2213fb06..b3ee37f395 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -625,6 +625,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
--
2.20.1
0005-Measure-row-filter-overhead.patch (text/x-patch)
From 3ddbe8db82efb198825e1889ff4cb0b9afb056b4 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Sun, 31 Jan 2021 20:48:43 -0300
Subject: [PATCH 5/6] Measure row filter overhead
---
src/backend/replication/pgoutput/pgoutput.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index ce6da8de19..e603a15ffd 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -570,6 +570,8 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
MemoryContext oldcxt;
ListCell *lc;
bool result = true;
+ instr_time start_time;
+ instr_time end_time;
/* Bail out if there is no row filter */
if (list_length(rowfilter) == 0)
@@ -579,6 +581,8 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
get_rel_name(relation->rd_id));
+ INSTR_TIME_SET_CURRENT(start_time);
+
tupdesc = RelationGetDescr(relation);
estate = create_estate_for_relation(relation);
@@ -627,6 +631,11 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
ResetExprContext(ecxt);
FreeExecutorState(estate);
+ INSTR_TIME_SET_CURRENT(end_time);
+ INSTR_TIME_SUBTRACT(end_time, start_time);
+
+ elog(DEBUG2, "row filter time: %0.3f us", INSTR_TIME_GET_DOUBLE(end_time) * 1e6);
+
return result;
}
--
2.20.1
0006-Debug-messages.patch (text/x-patch)
From e97203ac7340ed24fffa8e779fff0f5a35d92ba0 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Fri, 26 Feb 2021 21:08:10 -0300
Subject: [PATCH 6/6] Debug messages
---
src/backend/parser/parse_expr.c | 6 ++++++
src/test/subscription/t/020_row_filter.pl | 1 +
2 files changed, 7 insertions(+)
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 8eceac241e..52024d78d1 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -116,6 +116,12 @@ transformExprRecurse(ParseState *pstate, Node *expr)
if (expr == NULL)
return NULL;
+ /*
+ * T_FuncCall: 349
+ * EXPR_KIND_PUBLICATION_WHERE: 42
+ */
+ elog(DEBUG3, "nodeTag(expr): %d ; pstate->p_expr_kind: %d", nodeTag(expr), pstate->p_expr_kind);
+
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
diff --git a/src/test/subscription/t/020_row_filter.pl b/src/test/subscription/t/020_row_filter.pl
index b8c059d44b..41775c554a 100644
--- a/src/test/subscription/t/020_row_filter.pl
+++ b/src/test/subscription/t/020_row_filter.pl
@@ -8,6 +8,7 @@ use Test::More tests => 6;
# create publisher node
my $node_publisher = get_new_node('publisher');
$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf', 'log_min_messages = DEBUG3');
$node_publisher->start;
# create subscriber node
--
2.20.1
Hi Euler,
Please find some comments below:
1. If the WHERE clause contains non-replica-identity columns, a delete
performed on a replicated row using
DELETE FROM pub_tab WHERE repl_ident_col = n;
is not replicated, as logical replication does not have any information
about whether the column has to be filtered or not.
Shouldn't a warning be thrown in this case to notify the user that the
delete is not replicated?
2. Same for UPDATE: even if I update a row to match the quals on the
publisher, it is still not replicated to the subscriber (if the quals
contain non-replica-identity columns). I think for UPDATE at least, the
new value of the non-replica-identity column is available, which can be
used to filter and replicate the update.
3. 0001.patch,
Why is the name of the existing ExclusionWhereClause node being changed, if
the exact same definition is being used?
For 0002.patch,
4. +
+ memset(lrel, 0, sizeof(LogicalRepRelation));
Is this needed? Apart from the above, the patch does not use or update
lrel at all in that function.
5. PublicationRelationQual and PublicationTable have similar fields, can
PublicationTable
be used in place of PublicationRelationQual instead of defining a new
struct?
Thank you,
Rahila Syed
Hi Euler,
Please find below some review comments,
1.
+
+ <row>
+ <entry><structfield>prqual</structfield></entry>
+ <entry><type>pg_node_tree</type></entry>
+ <entry></entry>
+ <entry>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</entry>
+ </row>
I think the docs are being incorrectly updated to add a column to
pg_partitioned_table
instead of pg_publication_rel.
2. +typedef struct PublicationRelationQual
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationQual;
Can this be given a more generic name like PublicationRelationInfo, so that
the same struct
can be used to store additional relation information in future, for ex.
column names, if column filtering is introduced.
3. Also, in the above structure, it seems that we can do with storing just
relid and derive relation information from it
using table_open when needed. Am I missing something?
4. Currently in logical replication, I noticed that an UPDATE is being
applied on the subscriber even if the column values are unchanged. Can
the row-filtering feature be used to change this such that, when all the
OLD.columns = NEW.columns, the row is filtered out from being sent to
the subscriber? I understand this would need REPLICA IDENTITY FULL to
work, but it would be an improvement over the existing state.
On subscriber:
postgres=# select xmin, * from tab_rowfilter_1;
xmin | a | b
------+---+-------------
555 | 1 | unfiltered
(1 row)
On publisher:
postgres=# ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;
ALTER TABLE
postgres=# update tab_rowfilter_1 SET b = 'unfiltered' where a = 1;
UPDATE 1
On Subscriber: The xmin has changed indicating the update from the
publisher was applied
even though nothing changed.
postgres=# select xmin, * from tab_rowfilter_1;
xmin | a | b
------+---+-------------
556 | 1 | unfiltered
(1 row)
5. Currently, any existing rows that were not replicated, when updated to
match the publication quals using
UPDATE tab SET pub_qual_column = 'not_filtered' WHERE a = 1;
won't be applied, as the row does not exist on the subscriber. It would
be good if ALTER SUBSCRIPTION ... REFRESH PUBLICATION would help fetch
such existing rows from publishers that match the qual now (either
because the row changed or the qual changed).
Thank you,
Rahila Syed
On Tue, Mar 9, 2021, at 12:05 PM, Rahila Syed wrote:
Please find some comments below:
Thanks for your review.
1. If the where clause contains non-replica identity columns, the delete performed on a replicated row
using DELETE from pub_tab where repl_ident_col = n;
is not being replicated, as logical replication does not have any info whether the column has
to be filtered or not.
Shouldn't a warning be thrown in this case to notify the user that the delete is not replicated.
Isn't documentation enough? If you add a WARNING, it would be printed per row,
so a huge DELETE would flood the client with WARNING messages by default. If
you are thinking about LOG messages, it is a different story; however, we
should limit those messages to one per transaction. Even if we add such an aid,
it would impose a performance penalty to check that the DELETE is not
replicated because the row filter contains a column that is not part of the PK
or REPLICA IDENTITY. If I were to add any message, it would be a warning at
creation time (CREATE PUBLICATION or ALTER PUBLICATION ... [ADD|SET] TABLE).
2. Same for update, even if I update a row to match the quals on publisher, it is still not being replicated to
the subscriber. (if the quals contain non-replica identity columns). I think for UPDATE at least, the new value
of the non-replica-identity column is available, which can be used to filter and replicate the update.
Indeed, the row filter for UPDATE uses the new tuple. Maybe your non-replica
identity column contains NULL, which makes the expression evaluate to false.
3. 0001.patch,
Why is the name of the existing ExclusionWhereClause node being changed, if the exact same definition is being used?
Because the ExclusionWhereClause node is used for exclusion constraints. This
patch renames the node to make it clear it is a generic node that could be used
for other filtering features in the future.
For 0002.patch,
4. +
+ memset(lrel, 0, sizeof(LogicalRepRelation));
Is this needed? Apart from the above, the patch does not use or update lrel at all in that function.
Good catch. It is a leftover from a previous patch. It will be fixed in the
next patch set.
5. PublicationRelationQual and PublicationTable have similar fields, can PublicationTable
be used in place of PublicationRelationQual instead of defining a new struct?
I don't think it is a good idea to have additional fields in a parse node. The
DDL commands use Relation (PublicationRelationQual) and the parse code uses
RangeVar (PublicationTable). publicationcmds.c uses Relation everywhere, so I
decided to create a new struct to store the Relation and qual as a list item.
It also minimizes the places you have to modify.
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Thu, Mar 18, 2021, at 7:51 AM, Rahila Syed wrote:
1.
I think the docs are being incorrectly updated to add a column to pg_partitioned_table
instead of pg_publication_rel.
Good catch.
2. +typedef struct PublicationRelationQual
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationQual;
Can this be given a more generic name like PublicationRelationInfo, so that the same struct
can be used to store additional relation information in future, for ex. column names, if column filtering is introduced.
Good idea. I renamed it and it'll be in the next patch set.
3. Also, in the above structure, it seems that we can do with storing just relid and derive relation information from it
using table_open when needed. Am I missing something?
We need the Relation. See OpenTableList(). The way this code is organized, it
opens all publication tables and appends each Relation to a list. This list is
used in PublicationAddTables() to update the catalog. I tried to minimize the
number of refactors while introducing this feature. We could probably revise
this code in the future (someone said in a previous discussion that it is weird
to open relations in one source code file -- publicationcmds.c -- and use it
into another one -- pg_publication.c).
4. Currently in logical replication, I noticed that an UPDATE is being applied on the subscriber even if the column values
are unchanged. Can row-filtering feature be used to change it such that, when all the OLD.columns = NEW.columns, filter out
the row from being sent to the subscriber. I understand this would need REPLICA IDENTITY FULL to work, but would be an
improvement from the existing state.
This is how Postgres works.
postgres=# create table foo (a integer, b integer);
CREATE TABLE
postgres=# insert into foo values(1, 100);
INSERT 0 1
postgres=# select ctid, xmin, xmax, a, b from foo;
ctid | xmin | xmax | a | b
-------+--------+------+---+-----
(0,1) | 488920 | 0 | 1 | 100
(1 row)
postgres=# update foo set b = 101 where a = 1;
UPDATE 1
postgres=# select ctid, xmin, xmax, a, b from foo;
ctid | xmin | xmax | a | b
-------+--------+------+---+-----
(0,2) | 488921 | 0 | 1 | 101
(1 row)
postgres=# update foo set b = 101 where a = 1;
UPDATE 1
postgres=# select ctid, xmin, xmax, a, b from foo;
ctid | xmin | xmax | a | b
-------+--------+------+---+-----
(0,3) | 488922 | 0 | 1 | 101
(1 row)
You could probably abuse this feature and skip some UPDATEs when the old tuple
is identical to the new tuple. The question is: why would someone issue the
same command multiple times? A broken application? I would say: don't do it.
Besides that, this feature would impose an overhead on a code path that
already consumes substantial CPU time. I've seen some tables with REPLICA
IDENTITY FULL and dozens of columns that would certainly contribute to
increasing the replication lag.
5. Currently, any existing rows that were not replicated, when updated to match the publication quals
using UPDATE tab SET pub_qual_column = 'not_filtered' WHERE a = 1; won't be applied, as the row
does not exist on the subscriber. It would be good if ALTER SUBSCRIPTION ... REFRESH PUBLICATION
would help fetch such existing rows from publishers that match the qual now (either because the
row changed or the qual changed).
I see. This should be addressed by a resynchronize feature. Such an option is
useful when you have to change the row filter. It should certainly be
implemented as an ALTER SUBSCRIPTION subcommand.
I attached a new patch set that addresses:
* fix documentation;
* rename PublicationRelationQual to PublicationRelationInfo;
* remove the memset that was left over from a previous patch set;
* add new tests to improve coverage (INSERT/UPDATE/DELETE to exercise the row
filter code).
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
v12-0001-Rename-a-WHERE-node.patchtext/x-patch; name=v12-0001-Rename-a-WHERE-node.patchDownload
From a6d893be0091bc8cd8569ac380e6f628263d31c0 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 11:53:34 -0300
Subject: [PATCH v12 1/5] Rename a WHERE node
A WHERE clause will be used for row filtering in logical replication. We
already have a similar node: 'WHERE (condition here)'. Let's rename the
node to a generic name and use it for row filtering too.
---
src/backend/parser/gram.y | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index bc43641ffe..22bbb27041 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -495,7 +495,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <node> def_arg columnElem where_clause where_or_current_clause
a_expr b_expr c_expr AexprConst indirection_el opt_slice_bound
columnref in_expr having_clause func_table xmltable array_expr
- ExclusionWhereClause operator_def_arg
+ OptWhereClause operator_def_arg
%type <list> rowsfrom_item rowsfrom_list opt_col_def_list
%type <boolean> opt_ordinality
%type <list> ExclusionConstraintList ExclusionConstraintElem
@@ -3837,7 +3837,7 @@ ConstraintElem:
$$ = (Node *)n;
}
| EXCLUDE access_method_clause '(' ExclusionConstraintList ')'
- opt_c_include opt_definition OptConsTableSpace ExclusionWhereClause
+ opt_c_include opt_definition OptConsTableSpace OptWhereClause
ConstraintAttributeSpec
{
Constraint *n = makeNode(Constraint);
@@ -3939,7 +3939,7 @@ ExclusionConstraintElem: index_elem WITH any_operator
}
;
-ExclusionWhereClause:
+OptWhereClause:
WHERE '(' a_expr ')' { $$ = $3; }
| /*EMPTY*/ { $$ = NULL; }
;
--
2.20.1
v12-0002-Row-filter-for-logical-replication.patchtext/x-patch; name=v12-0002-Row-filter-for-logical-replication.patchDownload
From 2b50e8ef4ce7362426f6fd305fc3ab95a6d7e448 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 12:07:51 -0300
Subject: [PATCH v12 2/5] Row filter for logical replication
This feature adds a row filter for publication tables. When you define or
modify a publication, you can optionally filter out rows that do not satisfy a
WHERE condition. It allows you to partially replicate a database or set of tables.
The row filter is per table, which means that you can define different row
filters for different tables. A new row filter can be added simply by
specifying a WHERE clause after the table name. The WHERE expression must be
enclosed in parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, DELETEs won't
be replicated. DELETE uses the old row version (which is limited to the
primary key or REPLICA IDENTITY columns) to evaluate the row filter. INSERT
and UPDATE use the new row version to evaluate the row filter, hence, you can
use any column. If the row filter evaluates to NULL, it is treated as false.
For simplicity, functions are not allowed; this could possibly be addressed in
a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied.
If your publication contains a partitioned table, the parameter
publish_via_partition_root determines whether the row filter of the partition
(if the parameter is false, the default) or of the partitioned table is used.
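For readers skimming the patch, an illustrative session exercising the syntax it adds (table and column names are taken from the examples elsewhere in this thread, not prescribed by the patch):

```sql
-- Publish only rows matching the filter; parentheses are required.
CREATE PUBLICATION active_depts FOR TABLE departments WHERE (active IS TRUE);

-- ALTER PUBLICATION accepts the same optional WHERE clause.
ALTER PUBLICATION active_depts
    SET TABLE departments WHERE (id > 2000 AND id <= 3000);

-- The stored qual is kept in the new pg_publication_rel.prqual column.
SELECT pg_get_expr(prqual, prrelid) FROM pg_publication_rel;
```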
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 38 +++-
doc/src/sgml/ref/create_subscription.sgml | 8 +-
src/backend/catalog/pg_publication.c | 52 ++++-
src/backend/commands/publicationcmds.c | 98 +++++----
src/backend/parser/gram.y | 28 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 13 ++
src/backend/parser/parse_func.c | 3 +
src/backend/replication/logical/tablesync.c | 91 +++++++-
src/backend/replication/logical/worker.c | 14 +-
src/backend/replication/pgoutput/pgoutput.c | 212 +++++++++++++++++--
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/include/replication/logicalrelation.h | 2 +
src/test/regress/expected/publication.out | 32 +++
src/test/regress/sql/publication.sql | 23 ++
src/test/subscription/t/020_row_filter.pl | 221 ++++++++++++++++++++
22 files changed, 805 insertions(+), 87 deletions(-)
create mode 100644 src/test/subscription/t/020_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 68d1960698..83c8d33186 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6203,6 +6203,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b2c6..ca091aae33 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is executed with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbca55..715c37f2bb 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression.
</para>
<para>
@@ -131,9 +135,15 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
on its partitions) contained in the publication will be published
using the identity and schema of the partitioned table rather than
that of the individual partitions that are actually changed; the
- latter is the default. Enabling this allows the changes to be
- replicated into a non-partitioned table or a partitioned table
- consisting of a different set of partitions.
+ latter is the default (<literal>false</literal>). Enabling this
+ allows the changes to be replicated into a non-partitioned table or a
+ partitioned table consisting of a different set of partitions.
+ </para>
+
+ <para>
+ If this parameter is <literal>false</literal>, it uses the
+ <literal>WHERE</literal> clause from the partition; otherwise, the
+ <literal>WHERE</literal> clause from the partitioned table is used.
</para>
<para>
@@ -182,6 +192,14 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
disallowed on those tables.
</para>
+ <para>
+ The <literal>WHERE</literal> clause should probably contain only columns
+ that are part of the primary key or that are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will not
+ be replicated. For <command>INSERT</command> and <command>UPDATE</command>
+ operations, any column can be used in the <literal>WHERE</literal> clause.
+ </para>
+
<para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
@@ -197,6 +215,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -209,6 +232,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</programlisting>
</para>
+ <para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
<para>
Create a publication that publishes all changes in all tables:
<programlisting>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index e812beee37..b8f4ea5603 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,13 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 84d2efcfd2..abc3fb6411 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,11 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
+
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,18 +146,20 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationInfo *targetrel,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -161,7 +168,7 @@ publication_add_relation(Oid pubid, Relation targetrel,
* duplicates, it's here just to provide nicer error message in common
* case. The real protection is the unique key on the catalog.
*/
- if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid),
+ if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(targetrel->relid),
ObjectIdGetDatum(pubid)))
{
table_close(rel, RowExclusiveLock);
@@ -172,10 +179,27 @@ publication_add_relation(Oid pubid, Relation targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel), pub->name)));
+ RelationGetRelationName(targetrel->relation), pub->name)));
}
- check_publication_add_relation(targetrel);
+ check_publication_add_relation(targetrel->relation);
+
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(targetrel->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(targetrel->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -187,7 +211,13 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prpubid - 1] =
ObjectIdGetDatum(pubid);
values[Anum_pg_publication_rel_prrelid - 1] =
- ObjectIdGetDatum(relid);
+ ObjectIdGetDatum(targetrel->relid);
+
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
@@ -202,14 +232,20 @@ publication_add_relation(Oid pubid, Relation targetrel,
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
/* Add dependency on the relation */
- ObjectAddressSet(referenced, RelationRelationId, relid);
+ ObjectAddressSet(referenced, RelationRelationId, targetrel->relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel);
+ CacheInvalidateRelcache(targetrel->relation);
return myself;
}
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 95c253c8e0..e352c66d9c 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -372,6 +372,28 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ /*
+ * Although the ALTER PUBLICATION grammar allows a WHERE clause to be
+ * specified for the DROP TABLE action, it doesn't make sense to allow it. We
+ * implement
+ * this restriction here, instead of complicating the grammar to enforce
+ * it.
+ */
+ if (stmt->tableAction == DEFELEM_DROP)
+ {
+ ListCell *lc;
+
+ foreach(lc, stmt->tables)
+ {
+ PublicationTable *t = lfirst(lc);
+
+ if (t->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot use a WHERE clause when removing table from publication \"%s\"",
+ NameStr(pubform->pubname))));
+ }
+ }
+
rels = OpenTableList(stmt->tables);
if (stmt->tableAction == DEFELEM_ADD)
@@ -385,31 +407,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
+ PublicationRelationInfo *oldrel;
- foreach(newlc, rels)
- {
- Relation newrel = (Relation) lfirst(newlc);
-
- if (RelationGetRelid(newrel) == oldrelid)
- {
- found = true;
- break;
- }
- }
-
- if (!found)
- {
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
-
- delrels = lappend(delrels, oldrel);
- }
+ oldrel = palloc(sizeof(PublicationRelationInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -509,13 +524,15 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationInfo *pri;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
+ PublicationTable *t = lfirst(lc);
+ RangeVar *rv = castNode(RangeVar, t->relation);
bool recurse = rv->inh;
Relation rel;
Oid myrelid;
@@ -538,8 +555,11 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = myrelid;
+ pri->relation = rel;
+ pri->whereClause = t->whereClause;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, myrelid);
/*
@@ -572,7 +592,12 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = childrelid;
+ pri->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pri->whereClause = t->whereClause;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, childrelid);
}
}
@@ -593,10 +618,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(pri->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -612,15 +639,15 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(pri->relid, GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(pri->relation->rd_rel->relkind),
+ RelationGetRelationName(pri->relation));
- obj = publication_add_relation(pubid, rel, if_not_exists);
+ obj = publication_add_relation(pubid, pri, if_not_exists);
if (stmt)
{
EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
@@ -644,11 +671,10 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pri->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -658,7 +684,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pri->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 22bbb27041..2a931efb41 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -425,13 +425,13 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
vacuum_relation_list opt_vacuum_relation_list
- drop_option_list
+ drop_option_list publication_table_list
%type <groupclause> group_clause
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -3837,7 +3837,7 @@ ConstraintElem:
$$ = (Node *)n;
}
| EXCLUDE access_method_clause '(' ExclusionConstraintList ')'
- opt_c_include opt_definition OptConsTableSpace OptWhereClause
+ opt_c_include opt_definition OptConsTableSpace OptWhereClause
ConstraintAttributeSpec
{
Constraint *n = makeNode(Constraint);
@@ -9530,7 +9530,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9561,7 +9561,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9569,7 +9569,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9577,7 +9577,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE relation_expr_list
+ | ALTER PUBLICATION name DROP TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9587,6 +9587,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7c3e01aa22..544ff55a63 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -544,6 +544,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -940,6 +947,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f869e159d6..8eceac241e 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -508,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1765,6 +1773,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3048,6 +3059,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 37cebc7d82..bf909d6ed5 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2530,6 +2530,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 6ed31812ab..d47e04f35d 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -789,6 +793,51 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
+ walrcv_clear_result(res);
+
+ /* Get relation qual */
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(wrconn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
walrcv_clear_result(res);
pfree(cmd.data);
}
@@ -803,6 +852,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -811,7 +861,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -820,16 +870,23 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* List of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && list_length(qual) == 0)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -838,9 +895,31 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (list_length(qual) > 0)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
+
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
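To make the tablesync change above concrete: when a subscribed table is published with different WHERE clauses in several publications, copy_table now builds a COPY (SELECT ...) with the quals AND'ed. A sketch of the generated command, for a hypothetical table t published by pub1 and pub2:

```sql
-- pub1 publishes t WHERE (a > 10); pub2 publishes t WHERE (b = 5).
-- The subscriber's initial sync then runs roughly:
COPY (SELECT a, b FROM public.t WHERE (a > 10) AND (b = 5)) TO STDOUT;
```

so rows must satisfy all expressions to be copied, matching the behavior described in the commit message.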
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 21d304a64c..004d0d46c3 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -340,8 +340,8 @@ handle_streamed_transaction(LogicalRepMsgType action, StringInfo s)
*
* This is based on similar code in copy.c
*/
-static EState *
-create_estate_for_relation(LogicalRepRelMapEntry *rel)
+EState *
+create_estate_for_relation(Relation rel)
{
EState *estate;
RangeTblEntry *rte;
@@ -350,8 +350,8 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel)
rte = makeNode(RangeTblEntry);
rte->rtekind = RTE_RELATION;
- rte->relid = RelationGetRelid(rel->localrel);
- rte->relkind = rel->localrel->rd_rel->relkind;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
rte->rellockmode = AccessShareLock;
ExecInitRangeTable(estate, list_make1(rte));
@@ -1168,7 +1168,7 @@ apply_handle_insert(StringInfo s)
}
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -1293,7 +1293,7 @@ apply_handle_update(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -1450,7 +1450,7 @@ apply_handle_delete(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 1b993fb032..ce6da8de19 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -15,12 +15,22 @@
#include "access/tupconvert.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
+#include "catalog/pg_type.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/execnodes.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/planner.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
@@ -57,6 +67,10 @@ static void pgoutput_stream_abort(struct LogicalDecodingContext *ctx,
static void pgoutput_stream_commit(struct LogicalDecodingContext *ctx,
ReorderBufferTXN *txn,
XLogRecPtr commit_lsn);
+static ExprState *pgoutput_row_filter_prepare_expr(Node *rfnode, EState *estate);
+static inline bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, List *rowfilter);
static bool publications_valid;
static bool in_streaming;
@@ -98,6 +112,7 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *qual;
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -121,7 +136,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation rel);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -491,6 +506,130 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
OutputPluginWrite(ctx, false);
}
+static ExprState *
+pgoutput_row_filter_prepare_expr(Node *rfnode, EState *estate)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ exprstate = ExecPrepareExpr(expr, estate);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false i.e. the change
+ * isn't replicated.
+ */
+static inline bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, List *rowfilter)
+{
+ TupleDesc tupdesc;
+ EState *estate;
+ ExprContext *ecxt;
+ MemoryContext oldcxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (list_length(rowfilter) == 0)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ tupdesc = RelationGetDescr(relation);
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldcxt);
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, rowfilter)
+ {
+ Node *rfnode = (Node *) lfirst(lc);
+ ExprState *exprstate;
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_prepare_expr(rfnode, estate);
+
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ if (message_level_is_interesting(DEBUG3))
+ {
+ char *s = NULL;
+
+ s = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, CStringGetTextDatum(nodeToString(rfnode)), ObjectIdGetDatum(relation->rd_id)));
+ if (result)
+ elog(DEBUG3, "row filter \"%s\" matched", s);
+ else
+ elog(DEBUG3, "row filter \"%s\" not matched", s);
+ pfree(s);
+ }
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+
+ return result;
+}
+
/*
* Sends the decoded DML over wire.
*
@@ -518,7 +657,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -562,6 +701,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
tuple = execute_attr_map_tuple(tuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry->qual))
+ return;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_insert(ctx->out, xid, relation, tuple,
data->binary);
@@ -588,6 +731,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry->qual))
+ return;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_update(ctx->out, xid, relation, oldtuple,
newtuple, data->binary);
@@ -610,6 +757,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry->qual))
+ return;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
data->binary);
@@ -657,12 +808,11 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
for (i = 0; i < nrelations; i++)
{
Relation relation = relations[i];
- Oid relid = RelationGetRelid(relation);
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -672,10 +822,10 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
* root tables through it.
*/
if (relation->rd_rel->relispartition &&
- relentry->publish_as_relid != relid)
+ relentry->publish_as_relid != relentry->relid)
continue;
- relids[nrelids++] = relid;
+ relids[nrelids++] = relentry->relid;
maybe_send_schema(ctx, txn, change, relation, relentry);
}
@@ -944,16 +1094,21 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation rel)
{
RelationSyncEntry *entry;
- bool am_partition = get_rel_relispartition(relid);
- char relkind = get_rel_relkind(relid);
+ Oid relid;
+ bool am_partition;
+ char relkind;
bool found;
MemoryContext oldctx;
Assert(RelationSyncCache != NULL);
+ relid = RelationGetRelid(rel);
+ am_partition = get_rel_relispartition(relid);
+ relkind = get_rel_relkind(relid);
+
/* Find cached relation info, creating if not found */
entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
(void *) &relid,
@@ -969,6 +1124,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->qual = NIL;
entry->publish_as_relid = InvalidOid;
}
@@ -1000,6 +1156,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1059,9 +1218,29 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ entry->qual = lappend(entry->qual, rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1167,6 +1346,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1176,6 +1356,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1193,5 +1375,11 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (list_length(entry->qual) > 0)
+ list_free_deep(entry->qual);
+ entry->qual = NIL;
}
+
+ MemoryContextSwitchTo(oldctx);
}
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 1b31fee9e3..9a60211259 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,6 +85,13 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationInfo
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationInfo;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -110,7 +117,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationInfo *targetrel,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index aecf53b3b3..e2becb12eb 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, on pg_publication_rel using btree(oid oid_ops));
#define PublicationRelObjectIndexId 6112
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index e22df890ef..1c9f4fe765 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -485,6 +485,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 68425eb2c0..7570c3bc19 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3556,12 +3556,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3574,7 +3581,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 176b9f37c1..edb80fbe5d 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -79,6 +79,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 3f0b3deefb..a59ad2c9c8 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -49,4 +49,6 @@ extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
extern void logicalrep_typmap_update(LogicalRepTyp *remotetyp);
extern char *logicalrep_typmap_gettypname(Oid remoteid);
+extern EState *create_estate_for_relation(Relation rel);
+
#endif /* LOGICALRELATION_H */
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 63d6ab7a4e..c8cf1b685e 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -156,6 +156,38 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: cannot use a WHERE clause when removing table from publication "testpub5"
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: "testpub_view" is not a table
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075368..35211c56f6 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,29 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+\dRp+ testpub5
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/020_row_filter.pl b/src/test/subscription/t/020_row_filter.pl
new file mode 100644
index 0000000000..35a41741d3
--- /dev/null
+++ b/src/test/subscription/t/020_row_filter.pl
@@ -0,0 +1,221 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+
+# test row filtering
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x"
+);
+# use partition row filter:
+# - replicate (1, 100) because 1 < 6000 is true
+# - don't replicate (8000, 101) because 8000 < 6000 is false
+# - replicate (15000, 102) because partition tab_rowfilter_greater_10k doesn't have row filter
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(8000, 101),(15000, 102)"
+);
+# insert directly into partition
+# use partition row filter: replicate (2, 200) because 2 < 6000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200)");
+# use partition row filter: replicate (5500, 300) because 5500 < 6000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(5500, 300)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table sync to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102), 'check filtered data was copied to subscriber');
+
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+# UPDATE is not replicated; row filter evaluates to false when b = NULL
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+# DELETE is not replicated; b is not part of the PK or replica identity and
+# the old tuple contains b = NULL, hence the row filter evaluates to false
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check filtered data was copied to subscriber');
+
+# publish using partitioned table
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+# use partitioned table row filter: replicate, 4000 < 5000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400)");
+# use partitioned table row filter: replicate, 4500 < 5000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+# use partitioned table row filter: don't replicate, 5600 < 5000 is false
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+# use partitioned table row filter: don't replicate, 16000 < 5000 is false
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 1950)");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
2.20.1
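For reviewers skimming the pgoutput changes above: each publication's WHERE expression is evaluated against the tuple, a NULL result is treated as false (the change is not replicated), and when a subscription receives the same table through multiple publications, every row filter must match. A minimal Python model of that combination logic (the helper names are illustrative, not part of the patch):

```python
def exec_filter(result):
    # Model of pgoutput_row_filter_exec_expr: a NULL (None) result
    # counts as false, i.e. the change is not replicated.
    return False if result is None else bool(result)

def row_filter(filters, row):
    # Model of pgoutput_row_filter: with multiple publications, every
    # row filter must match (AND semantics) for the change to be sent.
    return all(exec_filter(f(row)) for f in filters)

def null_safe_gt(value, bound):
    # In SQL, comparing NULL yields NULL, not false
    return None if value is None else value > bound

# Two publications with different filters on the same table
filters = [lambda r: null_safe_gt(r["id"], 2000),
           lambda r: None if r["id"] is None else r["id"] <= 3000]

print(row_filter(filters, {"id": 2500}))  # True: both filters match
print(row_filter(filters, {"id": 1500}))  # False: id > 2000 fails
print(row_filter(filters, {"id": None}))  # False: NULL is taken as false
```

This mirrors the loop in pgoutput_row_filter that bails out as soon as one filter does not match.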
Attachment: v12-0003-Print-publication-WHERE-condition-in-psql.patch (text/x-patch)
From c0c1b69b4924d64ce5df4bc7c8a229cc73a8b6c5 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 12:09:19 -0300
Subject: [PATCH v12 3/5] Print publication WHERE condition in psql
---
src/bin/psql/describe.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index eeac0efc4f..983ba512f7 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6038,7 +6038,8 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
+ "SELECT n.nspname, c.relname,\n"
+ " pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
@@ -6068,6 +6069,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE %s",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
--
2.20.1
Attachment: v12-0004-Publication-WHERE-condition-support-for-pg_dump.patch (text/x-patch)
From 8c11843fa47126d37121fc16a52ffb82d9101443 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 14:29:59 -0300
Subject: [PATCH v12 4/5] Publication WHERE condition support for pg_dump
---
src/bin/pg_dump/pg_dump.c | 14 ++++++++++++--
src/bin/pg_dump/pg_dump.h | 1 +
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index da6cc054b0..72d87f21c8 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4130,6 +4130,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4141,7 +4142,8 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
/* Collect all publication membership info. */
appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
"FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
@@ -4151,6 +4153,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4191,6 +4194,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4223,8 +4230,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE %s", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 5340843081..155fc2ebb5 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -626,6 +626,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
--
2.20.1
v12-0005-Measure-row-filter-overhead.patch (text/x-patch)
From 2a1d560036b777c7823805052c094704f51b4324 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Sun, 31 Jan 2021 20:48:43 -0300
Subject: [PATCH v12 5/5] Measure row filter overhead
---
src/backend/replication/pgoutput/pgoutput.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index ce6da8de19..e603a15ffd 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -570,6 +570,8 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
MemoryContext oldcxt;
ListCell *lc;
bool result = true;
+ instr_time start_time;
+ instr_time end_time;
/* Bail out if there is no row filter */
if (list_length(rowfilter) == 0)
@@ -579,6 +581,8 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
get_rel_name(relation->rd_id));
+ INSTR_TIME_SET_CURRENT(start_time);
+
tupdesc = RelationGetDescr(relation);
estate = create_estate_for_relation(relation);
@@ -627,6 +631,11 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
ResetExprContext(ecxt);
FreeExecutorState(estate);
+ INSTR_TIME_SET_CURRENT(end_time);
+ INSTR_TIME_SUBTRACT(end_time, start_time);
+
+ elog(DEBUG2, "row filter time: %0.3f us", INSTR_TIME_GET_DOUBLE(end_time) * 1e6);
+
return result;
}
--
2.20.1
On 22.03.21 03:15, Euler Taveira wrote:
I attached a new patch set that addresses:
* fix documentation;
* rename PublicationRelationQual to PublicationRelationInfo;
* remove the memset that was leftover from a previous patch set;
* add new tests to improve coverage (INSERT/UPDATE/DELETE to exercise the
row filter code).
I have committed the 0001 patch.
Attached are a few fixup patches that I recommend you integrate into
your patch set. They address backward compatibility with PG13, and a
few more stylistic issues.
I suggest you combine your 0002, 0003, and 0004 patches into one. They
can't be used separately, and for example the psql changes in patch 0003
already appear as regression test output changes in 0002, so this
arrangement isn't useful. (0005 can be kept separately, since it's
mostly for debugging right now.)
Attachments:
0001-fixup-Row-filter-for-logical-replication.patch (text/plain)
From c29459505db88c86dd7aa61019fa406202c30b0a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 25 Mar 2021 11:57:48 +0100
Subject: [PATCH 1/6] fixup! Row filter for logical replication
Remove unused header files, clean up whitespace.
---
src/backend/catalog/pg_publication.c | 2 --
src/backend/replication/pgoutput/pgoutput.c | 4 ----
2 files changed, 6 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index f31ae28de2..78f5780fb7 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,11 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
-
#include "parser/parse_clause.h"
#include "parser/parse_collate.h"
#include "parser/parse_relation.h"
-
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index ce6da8de19..6151f34925 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -16,14 +16,10 @@
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
#include "catalog/pg_publication_rel.h"
-#include "catalog/pg_type.h"
#include "commands/defrem.h"
#include "executor/executor.h"
#include "fmgr.h"
-#include "nodes/execnodes.h"
#include "nodes/nodeFuncs.h"
-#include "optimizer/planner.h"
-#include "optimizer/optimizer.h"
#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
--
2.30.2
0002-fixup-Row-filter-for-logical-replication.patch (text/plain)
From 91c20ecb45a8627a9252ed589fe6b85927be8b45 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 25 Mar 2021 11:59:13 +0100
Subject: [PATCH 2/6] fixup! Row filter for logical replication
Allow replication from older PostgreSQL versions without prqual.
---
src/backend/replication/logical/tablesync.c | 70 +++++++++++----------
1 file changed, 37 insertions(+), 33 deletions(-)
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 40d84dadb5..246510c82e 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -796,49 +796,53 @@ fetch_remote_table_info(char *nspname, char *relname,
walrcv_clear_result(res);
/* Get relation qual */
- resetStringInfo(&cmd);
- appendStringInfo(&cmd,
- "SELECT pg_get_expr(prqual, prrelid) "
- " FROM pg_publication p "
- " INNER JOIN pg_publication_rel pr "
- " ON (p.oid = pr.prpubid) "
- " WHERE pr.prrelid = %u "
- " AND p.pubname IN (", lrel->remoteid);
-
- first = true;
- foreach(lc, MySubscription->publications)
+ if (walrcv_server_version(wrconn) >= 140000)
{
- char *pubname = strVal(lfirst(lc));
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
- if (first)
- first = false;
- else
- appendStringInfoString(&cmd, ", ");
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
- appendStringInfoString(&cmd, quote_literal_cstr(pubname));
- }
- appendStringInfoChar(&cmd, ')');
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
- res = walrcv_exec(wrconn, cmd.data, 1, qualRow);
+ res = walrcv_exec(wrconn, cmd.data, 1, qualRow);
- if (res->status != WALRCV_OK_TUPLES)
- ereport(ERROR,
- (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
- nspname, relname, res->err)));
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
- slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
- while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
- {
- Datum rf = slot_getattr(slot, 1, &isnull);
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
- if (!isnull)
- *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
- ExecClearTuple(slot);
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
}
- ExecDropSingleTupleTableSlot(slot);
- walrcv_clear_result(res);
pfree(cmd.data);
}
--
2.30.2
0003-fixup-Row-filter-for-logical-replication.patch (text/plain)
From d40b463def29cdded18579d6e584cb9f5f4764d6 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 25 Mar 2021 12:00:35 +0100
Subject: [PATCH 3/6] fixup! Row filter for logical replication
Use more idiomatic style for checking for empty or nonempty lists.
---
src/backend/replication/logical/tablesync.c | 4 ++--
src/backend/replication/pgoutput/pgoutput.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 246510c82e..c3d6847b35 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -881,7 +881,7 @@ copy_table(Relation rel)
initStringInfo(&cmd);
/* Regular table with no row filter */
- if (lrel.relkind == RELKIND_RELATION && list_length(qual) == 0)
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
@@ -902,7 +902,7 @@ copy_table(Relation rel)
appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
/* list of AND'ed filters */
- if (list_length(qual) > 0)
+ if (qual != NIL)
{
ListCell *lc;
bool first = true;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6151f34925..593a8c96c8 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -568,7 +568,7 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
bool result = true;
/* Bail out if there is no row filter */
- if (list_length(rowfilter) == 0)
+ if (rowfilter == NIL)
return true;
elog(DEBUG3, "table \"%s.%s\" has row filter",
@@ -1372,7 +1372,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
- if (list_length(entry->qual) > 0)
+ if (entry->qual != NIL)
list_free_deep(entry->qual);
entry->qual = NIL;
}
--
2.30.2
0004-fixup-Row-filter-for-logical-replication.patch (text/plain)
From b82e7b05bb3fd73bede39f480c31d657fe410e41 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 25 Mar 2021 12:02:48 +0100
Subject: [PATCH 4/6] fixup! Row filter for logical replication
elog arguments are not evaluated unless the message level is
interesting, so the previous workaround is unnecessary.
---
src/backend/replication/pgoutput/pgoutput.c | 16 +++++-----------
1 file changed, 5 insertions(+), 11 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 593a8c96c8..00ad8f001f 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -603,17 +603,11 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
/* Evaluates row filter */
result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
- if (message_level_is_interesting(DEBUG3))
- {
- char *s = NULL;
-
- s = TextDatumGetCString(DirectFunctionCall2(pg_get_expr, CStringGetTextDatum(nodeToString(rfnode)), ObjectIdGetDatum(relation->rd_id)));
- if (result)
- elog(DEBUG3, "row filter \"%s\" matched", s);
- else
- elog(DEBUG3, "row filter \"%s\" not matched", s);
- pfree(s);
- }
+ elog(DEBUG3, "row filter \"%s\" %smatched",
+ TextDatumGetCString(DirectFunctionCall2(pg_get_expr,
+ CStringGetTextDatum(nodeToString(rfnode)),
+ ObjectIdGetDatum(relation->rd_id))),
+ result ? "" : "not ");
/* If the tuple does not match one of the row filters, bail out */
if (!result)
--
2.30.2
0005-fixup-Publication-WHERE-condition-support-for-pg_dum.patch (text/plain)
From af012ad7d7592f88e353642fd715a3825ea2ec72 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 25 Mar 2021 12:07:11 +0100
Subject: [PATCH 5/6] fixup! Publication WHERE condition support for pg_dump
Make pg_dump backward compatible.
Also add necessary parentheses around the expression. pg_get_expr will
supply the parentheses in many cases, but it won't for things like
"WHERE TRUE".
---
src/bin/pg_dump/pg_dump.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 72d87f21c8..af57a50f27 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4141,10 +4141,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid, "
- "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 140000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4233,7 +4239,7 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
if (pubrinfo->pubrelqual)
- appendPQExpBuffer(query, " WHERE %s", pubrinfo->pubrelqual);
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
appendPQExpBufferStr(query, ";\n");
/*
--
2.30.2
0006-fixup-Print-publication-WHERE-condition-in-psql.patch (text/plain)
From 326c489012f020f8ef38e1187aea72b332d1b9a4 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 25 Mar 2021 12:08:51 +0100
Subject: [PATCH 6/6] fixup! Print publication WHERE condition in psql
Make psql backward compatible. Also add necessary parentheses.
---
src/bin/psql/describe.c | 11 +++++++----
src/test/regress/expected/publication.out | 2 +-
2 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 983ba512f7..6dce968414 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6038,9 +6038,12 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname,\n"
- " pg_get_expr(pr.prqual, c.oid)\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 140000)
+ appendPQExpBuffer(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6070,7 +6073,7 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 1));
if (!PQgetisnull(tabres, j, 2))
- appendPQExpBuffer(&buf, " WHERE %s",
+ appendPQExpBuffer(&buf, " WHERE (%s)",
PQgetvalue(tabres, j, 2));
printTableAddFooter(&cont, buf.data);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index c8cf1b685e..da49a2e215 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -181,7 +181,7 @@ ERROR: cannot use a WHERE clause when removing table from publication "testpub5
--------------------------+------------+---------+---------+---------+-----------+----------
regress_publication_user | f | t | t | t | t | f
Tables:
- "public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
DROP TABLE testpub_rf_tbl1;
DROP TABLE testpub_rf_tbl2;
--
2.30.2
On Thu, Mar 25, 2021, at 8:15 AM, Peter Eisentraut wrote:
I have committed the 0001 patch.
Attached are a few fixup patches that I recommend you integrate into
your patch set. They address backward compatibility with PG13, and a
few more stylistic issues.

I suggest you combine your 0002, 0003, and 0004 patches into one. They
can't be used separately, and for example the psql changes in patch 0003
already appear as regression test output changes in 0002, so this
arrangement isn't useful. (0005 can be kept separately, since it's
mostly for debugging right now.)
I appreciate your work on it. I split the psql and pg_dump support out just
because it was developed after the main patch. I expected them to be combined
into the main patch (0002) before committing it. This new patch set integrates
them into the main patch.
I totally forgot about the backward compatibility support. Good catch. While
inspecting the code again, I made a small fix to the psql support. I added an
else branch, as shown below, so that the query always returns the same number
of columns and we cannot pass an out-of-range column number to PQgetisnull()
a few lines later.
if (pset.sversion >= 140000)
appendPQExpBuffer(&buf,
", pg_get_expr(pr.prqual, c.oid)");
else
appendPQExpBuffer(&buf,
", NULL");
While testing replication from v14 to v10, I realized that even if the
tables in the publication have row filters, the data synchronization code won't
evaluate the row filter expressions. That's because the subscriber (v10) is
responsible for assembling the COPY command (possibly adding row filters) for
data synchronization, and there is no such code in released versions. I added a
sentence to the copy_data parameter documentation saying that row filters won't
be used if the subscriber version is older than 14. I also included this
information in the commit message.
At this time, I didn't include the patch that changes log_min_messages in
the row filter regression test; it was part of previous patch sets for testing
purposes only.
I don't expect the patch that measures row filter performance to be committed,
but I'm including it again in case someone wants to inspect the performance
numbers.
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
v13-0001-Row-filter-for-logical-replication.patch (text/x-patch)
From 692572d0daaf8905e750889aa2f374764d86636c Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 12:07:51 -0300
Subject: [PATCH v13 1/2] Row filter for logical replication
This feature adds row filters for publication tables. When you define or modify
a publication, you can optionally filter out rows that do not satisfy a WHERE
condition. It allows you to partially replicate a database or a set of tables.
The row filter is per table, which means that you can define different row
filters for different tables. A new row filter can be added simply by
specifying a WHERE clause after the table name. The WHERE expression must be
enclosed in parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, DELETEs won't
be replicated. DELETE uses the old row version (which is limited to the
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, so you can use any
column. If the row filter evaluates to NULL, it is treated as false. For
simplicity, functions are not allowed; this could be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If the subscriber is a pre-14 version, data
synchronization won't use row filters even if they are defined on the
publisher; previous versions cannot handle row filters.
If your publication contains a partitioned table, the parameter
publish_via_partition_root determines whether the partition's row filter (when
the parameter is false, the default) or the partitioned table's row filter is
used.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 38 +++-
doc/src/sgml/ref/create_subscription.sgml | 11 +-
src/backend/catalog/pg_publication.c | 50 ++++-
src/backend/commands/publicationcmds.c | 98 +++++----
src/backend/parser/gram.y | 26 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 13 ++
src/backend/parser/parse_func.c | 3 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/logical/worker.c | 14 +-
src/backend/replication/pgoutput/pgoutput.c | 202 ++++++++++++++++--
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/include/replication/logicalrelation.h | 2 +
src/test/regress/expected/publication.out | 32 +++
src/test/regress/sql/publication.sql | 23 ++
src/test/subscription/t/020_row_filter.pl | 221 ++++++++++++++++++++
25 files changed, 833 insertions(+), 92 deletions(-)
create mode 100644 src/test/subscription/t/020_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index f103d914a6..4f7ebcd967 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6213,6 +6213,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b2c6..ca091aae33 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is executed with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbca55..715c37f2bb 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression.
</para>
<para>
@@ -131,9 +135,15 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
on its partitions) contained in the publication will be published
using the identity and schema of the partitioned table rather than
that of the individual partitions that are actually changed; the
- latter is the default. Enabling this allows the changes to be
- replicated into a non-partitioned table or a partitioned table
- consisting of a different set of partitions.
+ latter is the default (<literal>false</literal>). Enabling this
+ allows the changes to be replicated into a non-partitioned table or a
+ partitioned table consisting of a different set of partitions.
+ </para>
+
+ <para>
+ If this parameter is <literal>false</literal>, it uses the
+ <literal>WHERE</literal> clause from the partition; otherwise, the
+ <literal>WHERE</literal> clause from the partitioned table is used.
</para>
<para>
@@ -182,6 +192,14 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
disallowed on those tables.
</para>
+ <para>
+ The <literal>WHERE</literal> clause should probably contain only columns
+ that are part of the primary key or that are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will not
+ be replicated. For <command>INSERT</command> and <command>UPDATE</command>
+ operations, any column can be used in the <literal>WHERE</literal> clause.
+ </para>
+
<para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
@@ -197,6 +215,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -209,6 +232,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</programlisting>
</para>
+ <para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
<para>
Create a publication that publishes all changes in all tables:
<programlisting>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index e812beee37..6fcd2fd447 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,16 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If any table in the publications has a
+ <literal>WHERE</literal> clause, data synchronization does not use it
+ if the subscriber is a <productname>PostgreSQL</productname> version
+ before 14.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 86e415af89..78f5780fb7 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,18 +144,20 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationInfo *targetrel,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -161,7 +166,7 @@ publication_add_relation(Oid pubid, Relation targetrel,
* duplicates, it's here just to provide nicer error message in common
* case. The real protection is the unique key on the catalog.
*/
- if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid),
+ if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(targetrel->relid),
ObjectIdGetDatum(pubid)))
{
table_close(rel, RowExclusiveLock);
@@ -172,10 +177,27 @@ publication_add_relation(Oid pubid, Relation targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel), pub->name)));
+ RelationGetRelationName(targetrel->relation), pub->name)));
}
- check_publication_add_relation(targetrel);
+ check_publication_add_relation(targetrel->relation);
+
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(targetrel->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(targetrel->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -187,7 +209,13 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prpubid - 1] =
ObjectIdGetDatum(pubid);
values[Anum_pg_publication_rel_prrelid - 1] =
- ObjectIdGetDatum(relid);
+ ObjectIdGetDatum(targetrel->relid);
+
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
@@ -202,14 +230,20 @@ publication_add_relation(Oid pubid, Relation targetrel,
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
/* Add dependency on the relation */
- ObjectAddressSet(referenced, RelationRelationId, relid);
+ ObjectAddressSet(referenced, RelationRelationId, targetrel->relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel);
+ CacheInvalidateRelcache(targetrel->relation);
return myself;
}
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 95c253c8e0..e352c66d9c 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -372,6 +372,28 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
+ /*
+ * Although the ALTER PUBLICATION grammar allows a WHERE clause to be
+ * specified for the DROP TABLE action, it doesn't make sense to allow
+ * it. We implement this restriction here, instead of complicating the
+ * grammar to enforce it.
+ */
+ if (stmt->tableAction == DEFELEM_DROP)
+ {
+ ListCell *lc;
+
+ foreach(lc, stmt->tables)
+ {
+ PublicationTable *t = lfirst(lc);
+
+ if (t->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot use a WHERE clause when removing table from publication \"%s\"",
+ NameStr(pubform->pubname))));
+ }
+ }
+
rels = OpenTableList(stmt->tables);
if (stmt->tableAction == DEFELEM_ADD)
@@ -385,31 +407,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could instead remove only
+ * (i) tables that are not found in the new table list and (ii) tables
+ * that are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
+ PublicationRelationInfo *oldrel;
- foreach(newlc, rels)
- {
- Relation newrel = (Relation) lfirst(newlc);
-
- if (RelationGetRelid(newrel) == oldrelid)
- {
- found = true;
- break;
- }
- }
-
- if (!found)
- {
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
-
- delrels = lappend(delrels, oldrel);
- }
+ oldrel = palloc(sizeof(PublicationRelationInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -509,13 +524,15 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationInfo *pri;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
+ PublicationTable *t = lfirst(lc);
+ RangeVar *rv = castNode(RangeVar, t->relation);
bool recurse = rv->inh;
Relation rel;
Oid myrelid;
@@ -538,8 +555,11 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = myrelid;
+ pri->relation = rel;
+ pri->whereClause = t->whereClause;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, myrelid);
/*
@@ -572,7 +592,12 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = childrelid;
+ pri->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pri->whereClause = t->whereClause;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, childrelid);
}
}
@@ -593,10 +618,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(pri->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -612,15 +639,15 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(pri->relid, GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(pri->relation->rd_rel->relkind),
+ RelationGetRelationName(pri->relation));
- obj = publication_add_relation(pubid, rel, if_not_exists);
+ obj = publication_add_relation(pubid, pri, if_not_exists);
if (stmt)
{
EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
@@ -644,11 +671,10 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pri->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -658,7 +684,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pri->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 7ff36bc842..e3833f5631 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -426,13 +426,13 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
vacuum_relation_list opt_vacuum_relation_list
- drop_option_list
+ drop_option_list publication_table_list
%type <groupclause> group_clause
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -9577,7 +9577,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9608,7 +9608,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9616,7 +9616,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9624,7 +9624,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE relation_expr_list
+ | ALTER PUBLICATION name DROP TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9634,6 +9634,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
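
With the grammar additions above, each table in the publication list may carry its own optional qualification. A sketch of the accepted syntax (table and column names here are illustrative, not from the patch's regression tests):

```sql
-- Each table can have its own WHERE clause.
CREATE PUBLICATION pub_active FOR TABLE
    departments WHERE (id > 2000 AND id <= 3000),
    employees WHERE (active IS TRUE);

-- ALTER PUBLICATION ... ADD/SET TABLE accept the same form;
-- DROP TABLE rejects a WHERE clause, per the check above.
ALTER PUBLICATION pub_active SET TABLE departments WHERE (id <= 3000);
```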
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index ceb0bf597d..86c16c3def 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -950,6 +957,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 03373d551f..52c46fc7a7 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -509,6 +516,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1769,6 +1777,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3054,6 +3065,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index debef1d14f..bec927e1da 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2533,6 +2533,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
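
The parser changes above restrict what a publication WHERE clause may contain. Assuming the error messages added in this patch, statements along these lines (hypothetical table names) would be rejected:

```sql
-- ERROR: aggregate functions are not allowed in publication WHERE expressions
CREATE PUBLICATION p1 FOR TABLE t WHERE (count(*) > 0);

-- ERROR: functions are not allowed in publication WHERE expressions
CREATE PUBLICATION p2 FOR TABLE t WHERE (length(name) > 10);

-- ERROR: cannot use subquery in publication WHERE expression
CREATE PUBLICATION p3 FOR TABLE t WHERE (id IN (SELECT id FROM other));
```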
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 8494db8f05..c3d6847b35 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -790,6 +794,55 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /* Get relation qual */
+ if (walrcv_server_version(wrconn) >= 140000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(wrconn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -803,6 +856,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -811,7 +865,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -820,16 +874,23 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* List of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -838,9 +899,31 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
+
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
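
For initial synchronization, the code above turns the quals fetched from the publisher into a single COPY (SELECT ...) with the filters AND'ed. For a table published in two publications with different filters, the generated command would look roughly like this (column and filter expressions are illustrative):

```sql
COPY (SELECT id, name FROM public.departments
      WHERE (id > 2000 AND id <= 3000) AND (active IS TRUE)) TO STDOUT;
```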
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 354fbe4b4b..04750f1778 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -340,8 +340,8 @@ handle_streamed_transaction(LogicalRepMsgType action, StringInfo s)
*
* This is based on similar code in copy.c
*/
-static EState *
-create_estate_for_relation(LogicalRepRelMapEntry *rel)
+EState *
+create_estate_for_relation(Relation rel)
{
EState *estate;
RangeTblEntry *rte;
@@ -350,8 +350,8 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel)
rte = makeNode(RangeTblEntry);
rte->rtekind = RTE_RELATION;
- rte->relid = RelationGetRelid(rel->localrel);
- rte->relkind = rel->localrel->rd_rel->relkind;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
rte->rellockmode = AccessShareLock;
ExecInitRangeTable(estate, list_make1(rte));
@@ -1168,7 +1168,7 @@ apply_handle_insert(StringInfo s)
}
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -1293,7 +1293,7 @@ apply_handle_update(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -1450,7 +1450,7 @@ apply_handle_delete(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 1b993fb032..00ad8f001f 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -15,12 +15,18 @@
#include "access/tupconvert.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
@@ -57,6 +63,10 @@ static void pgoutput_stream_abort(struct LogicalDecodingContext *ctx,
static void pgoutput_stream_commit(struct LogicalDecodingContext *ctx,
ReorderBufferTXN *txn,
XLogRecPtr commit_lsn);
+static ExprState *pgoutput_row_filter_prepare_expr(Node *rfnode, EState *estate);
+static inline bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, List *rowfilter);
static bool publications_valid;
static bool in_streaming;
@@ -98,6 +108,7 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *qual;
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -121,7 +132,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation rel);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -491,6 +502,124 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
OutputPluginWrite(ctx, false);
}
+static ExprState *
+pgoutput_row_filter_prepare_expr(Node *rfnode, EState *estate)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ exprstate = ExecPrepareExpr(expr, estate);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the
+ * change isn't replicated.
+ */
+static inline bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filters, if any.
+ *
+ * Returns true if the change should be replicated, otherwise false.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, List *rowfilter)
+{
+ TupleDesc tupdesc;
+ EState *estate;
+ ExprContext *ecxt;
+ MemoryContext oldcxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (rowfilter == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ tupdesc = RelationGetDescr(relation);
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldcxt);
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, rowfilter)
+ {
+ Node *rfnode = (Node *) lfirst(lc);
+ ExprState *exprstate;
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_prepare_expr(rfnode, estate);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ elog(DEBUG3, "row filter \"%s\" %smatched",
+ TextDatumGetCString(DirectFunctionCall2(pg_get_expr,
+ CStringGetTextDatum(nodeToString(rfnode)),
+ ObjectIdGetDatum(relation->rd_id))),
+ result ? "" : "not ");
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+
+ return result;
+}
+
/*
* Sends the decoded DML over wire.
*
@@ -518,7 +647,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -562,6 +691,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
tuple = execute_attr_map_tuple(tuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry->qual))
+ return;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_insert(ctx->out, xid, relation, tuple,
data->binary);
@@ -588,6 +721,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry->qual))
+ return;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_update(ctx->out, xid, relation, oldtuple,
newtuple, data->binary);
@@ -610,6 +747,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry->qual))
+ return;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
data->binary);
@@ -657,12 +798,11 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
for (i = 0; i < nrelations; i++)
{
Relation relation = relations[i];
- Oid relid = RelationGetRelid(relation);
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -672,10 +812,10 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
* root tables through it.
*/
if (relation->rd_rel->relispartition &&
- relentry->publish_as_relid != relid)
+ relentry->publish_as_relid != relentry->relid)
continue;
- relids[nrelids++] = relid;
+ relids[nrelids++] = relentry->relid;
maybe_send_schema(ctx, txn, change, relation, relentry);
}
@@ -944,16 +1084,21 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation rel)
{
RelationSyncEntry *entry;
- bool am_partition = get_rel_relispartition(relid);
- char relkind = get_rel_relkind(relid);
+ Oid relid;
+ bool am_partition;
+ char relkind;
bool found;
MemoryContext oldctx;
Assert(RelationSyncCache != NULL);
+ relid = RelationGetRelid(rel);
+ am_partition = get_rel_relispartition(relid);
+ relkind = get_rel_relkind(relid);
+
/* Find cached relation info, creating if not found */
entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
(void *) &relid,
@@ -969,6 +1114,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->qual = NIL;
entry->publish_as_relid = InvalidOid;
}
@@ -1000,6 +1146,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1059,9 +1208,29 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ entry->qual = lappend(entry->qual, rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1167,6 +1336,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1176,6 +1346,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1193,5 +1365,11 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->qual != NIL)
+ list_free_deep(entry->qual);
+ entry->qual = NIL;
}
+
+ MemoryContextSwitchTo(oldctx);
}
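
A consequence of the AND semantics in pgoutput_row_filter() above: when a subscription pulls from several publications that publish the same table with different filters, a row is replicated only if it satisfies all of them, and a filter evaluating to NULL counts as false. A hypothetical illustration:

```sql
CREATE PUBLICATION p_low  FOR TABLE t WHERE (id > 100);
CREATE PUBLICATION p_high FOR TABLE t WHERE (id < 200);
-- A subscription to both p_low and p_high receives only rows with
-- id between 101 and 199; a row with id IS NULL is never sent.
```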
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index da6cc054b0..af57a50f27 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4130,6 +4130,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4140,9 +4141,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 140000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4151,6 +4159,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4191,6 +4200,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4223,8 +4236,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 5340843081..155fc2ebb5 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -626,6 +626,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 440249ff69..35d901295d 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6167,8 +6167,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 140000)
+ appendPQExpBuffer(&buf,
+ ", pg_catalog.pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBuffer(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6197,6 +6204,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 1b31fee9e3..9a60211259 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,6 +85,13 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationInfo
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationInfo;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -110,7 +117,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationInfo *targetrel,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index aecf53b3b3..e2becb12eb 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, on pg_publication_rel using btree(oid oid_ops));
#define PublicationRelObjectIndexId 6112
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 299956f329..f242092300 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -486,6 +486,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 12e0e026dc..a0161c50f8 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3575,12 +3575,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3593,7 +3600,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index a71d7e1f74..01bed24717 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 3f0b3deefb..a59ad2c9c8 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -49,4 +49,6 @@ extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
extern void logicalrep_typmap_update(LogicalRepTyp *remotetyp);
extern char *logicalrep_typmap_gettypname(Oid remoteid);
+extern EState *create_estate_for_relation(Relation rel);
+
#endif /* LOGICALRELATION_H */
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 63d6ab7a4e..da49a2e215 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -156,6 +156,38 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: cannot use a WHERE clause when removing table from publication "testpub5"
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: "testpub_view" is not a table
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075368..35211c56f6 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,29 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+\dRp+ testpub5
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/020_row_filter.pl b/src/test/subscription/t/020_row_filter.pl
new file mode 100644
index 0000000000..35a41741d3
--- /dev/null
+++ b/src/test/subscription/t/020_row_filter.pl
@@ -0,0 +1,221 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+
+# test row filtering
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x"
+);
+# use partition row filter:
+# - replicate (1, 100) because 1 < 6000 is true
+# - don't replicate (8000, 101) because 8000 < 6000 is false
+# - replicate (15000, 102) because partition tab_rowfilter_greater_10k doesn't have row filter
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(8000, 101),(15000, 102)"
+);
+# insert directly into partition
+# use partition row filter: replicate (2, 200) because 2 < 6000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200)");
+# use partition row filter: replicate (5500, 300) because 5500 < 6000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(5500, 300)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table sync to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102), 'check filtered data was copied to subscriber');
+
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+# UPDATE is not replicated ; row filter evaluates to false when b = NULL
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+# DELETE is not replicated ; b is not part of the PK or replica identity and
+# old tuple contains b = NULL, hence, row filter evaluates to false
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check filtered data was copied to subscriber');
+
+# publish using partitioned table
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+# use partitioned table row filter: replicate, 4000 < 5000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400)");
+# use partitioned table row filter: replicate, 4500 < 5000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+# use partitioned table row filter: don't replicate, 5600 < 5000 is false
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+# use partitioned table row filter: don't replicate, 16000 < 5000 is false
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 1950)");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
2.20.1
v13-0002-Measure-row-filter-overhead.patch (text/x-patch) Download
From ef685a531a043ace67783c9ffe01e765617cfe15 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Sun, 31 Jan 2021 20:48:43 -0300
Subject: [PATCH v13 2/2] Measure row filter overhead
---
src/backend/replication/pgoutput/pgoutput.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 00ad8f001f..a67459aec6 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -566,6 +566,8 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
MemoryContext oldcxt;
ListCell *lc;
bool result = true;
+ instr_time start_time;
+ instr_time end_time;
/* Bail out if there is no row filter */
if (rowfilter == NIL)
@@ -575,6 +577,8 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
get_rel_name(relation->rd_id));
+ INSTR_TIME_SET_CURRENT(start_time);
+
tupdesc = RelationGetDescr(relation);
estate = create_estate_for_relation(relation);
@@ -617,6 +621,11 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
ResetExprContext(ecxt);
FreeExecutorState(estate);
+ INSTR_TIME_SET_CURRENT(end_time);
+ INSTR_TIME_SUBTRACT(end_time, start_time);
+
+ elog(DEBUG2, "row filter time: %0.3f us", INSTR_TIME_GET_DOUBLE(end_time) * 1e6);
+
return result;
}
--
2.20.1
Hi Euler,
While running some tests on v13 patches, I noticed that, in case the
published table data
already exists on the subscriber database before creating the subscription,
at the time of
CREATE subscription/table synchronization, an error is seen as follows
With the patch:
2021-03-29 14:32:56.265 IST [78467] STATEMENT: CREATE_REPLICATION_SLOT
"pg_16406_sync_16390_6944995860755251708" LOGICAL pgoutput USE_SNAPSHOT
2021-03-29 14:32:56.279 IST [78467] LOG: could not send data to client:
Broken pipe
2021-03-29 14:32:56.279 IST [78467] STATEMENT: COPY (SELECT aid, bid,
abalance, filler FROM public.pgbench_accounts WHERE (aid > 0)) TO STDOUT
2021-03-29 14:32:56.279 IST [78467] FATAL: connection to client lost
2021-03-29 14:32:56.279 IST [78467] STATEMENT: COPY (SELECT aid, bid,
abalance, filler FROM public.pgbench_accounts WHERE (aid > 0)) TO STDOUT
2021-03-29 14:33:01.302 IST [78470] LOG: logical decoding found consistent
point at 0/4E2B8460
2021-03-29 14:33:01.302 IST [78470] DETAIL: There are no running
transactions.
Without the patch:
2021-03-29 15:05:01.581 IST [79029] ERROR: duplicate key value violates
unique constraint "pgbench_branches_pkey"
2021-03-29 15:05:01.581 IST [79029] DETAIL: Key (bid)=(1) already exists.
2021-03-29 15:05:01.581 IST [79029] CONTEXT: COPY pgbench_branches, line 1
2021-03-29 15:05:01.583 IST [78538] LOG: background worker "logical
replication worker" (PID 79029) exited with exit code 1
2021-03-29 15:05:06.593 IST [79031] LOG: logical replication table
synchronization worker for subscription "test_sub2", table
"pgbench_branches" has started
Without the patch, the COPY command throws an ERROR; with the patch, a
similar scenario results in the client connection being lost.
I didn't investigate further, but it looks like we should maintain the
existing behaviour when table synchronization fails due to duplicate data.
Thank you,
Rahila Syed
On Mon, Mar 29, 2021, at 6:45 AM, Rahila Syed wrote:
While running some tests on v13 patches, I noticed that, in case the published table data
already exists on the subscriber database before creating the subscription, at the time of
CREATE subscription/table synchronization, an error is seen as follows. With the patch:
2021-03-29 14:32:56.265 IST [78467] STATEMENT: CREATE_REPLICATION_SLOT "pg_16406_sync_16390_6944995860755251708" LOGICAL pgoutput USE_SNAPSHOT
2021-03-29 14:32:56.279 IST [78467] LOG: could not send data to client: Broken pipe
2021-03-29 14:32:56.279 IST [78467] STATEMENT: COPY (SELECT aid, bid, abalance, filler FROM public.pgbench_accounts WHERE (aid > 0)) TO STDOUT
2021-03-29 14:32:56.279 IST [78467] FATAL: connection to client lost
2021-03-29 14:32:56.279 IST [78467] STATEMENT: COPY (SELECT aid, bid, abalance, filler FROM public.pgbench_accounts WHERE (aid > 0)) TO STDOUT
2021-03-29 14:33:01.302 IST [78470] LOG: logical decoding found consistent point at 0/4E2B8460
2021-03-29 14:33:01.302 IST [78470] DETAIL: There are no running transactions.
Rahila, I tried to reproduce this issue with the attached script but no luck. I always get
Without the patch:
2021-03-29 15:05:01.581 IST [79029] ERROR: duplicate key value violates unique constraint "pgbench_branches_pkey"
2021-03-29 15:05:01.581 IST [79029] DETAIL: Key (bid)=(1) already exists.
2021-03-29 15:05:01.581 IST [79029] CONTEXT: COPY pgbench_branches, line 1
2021-03-29 15:05:01.583 IST [78538] LOG: background worker "logical replication worker" (PID 79029) exited with exit code 1
2021-03-29 15:05:06.593 IST [79031] LOG: logical replication table synchronization worker for subscription "test_sub2", table "pgbench_branches" has started
... this message. The code that reports this error is from the COPY command.
The row filter modifications have no control over it. It seems your
subscriber somehow closed the replication connection, causing this issue. Can you
reproduce it consistently? If so, please share your steps.
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
Hi,
While running some tests on v13 patches, I noticed that, in case the
published table data
already exists on the subscriber database before creating the
subscription, at the time of
CREATE subscription/table synchronization, an error is seen as follows. With the patch:
2021-03-29 14:32:56.265 IST [78467] STATEMENT: CREATE_REPLICATION_SLOT
"pg_16406_sync_16390_6944995860755251708" LOGICAL pgoutput USE_SNAPSHOT
2021-03-29 14:32:56.279 IST [78467] LOG: could not send data to client:
Broken pipe
2021-03-29 14:32:56.279 IST [78467] STATEMENT: COPY (SELECT aid, bid,
abalance, filler FROM public.pgbench_accounts WHERE (aid > 0)) TO STDOUT
2021-03-29 14:32:56.279 IST [78467] FATAL: connection to client lost
2021-03-29 14:32:56.279 IST [78467] STATEMENT: COPY (SELECT aid, bid,
abalance, filler FROM public.pgbench_accounts WHERE (aid > 0)) TO STDOUT
2021-03-29 14:33:01.302 IST [78470] LOG: logical decoding found
consistent point at 0/4E2B8460
2021-03-29 14:33:01.302 IST [78470] DETAIL: There are no running
transactions.

Rahila, I tried to reproduce this issue with the attached script but no luck. I always get

OK, sorry for the confusion. Actually both errors are happening on
different servers: the *broken pipe* error on the publisher end and
the following error on the subscriber end. And the behaviour is consistent with
or without row filtering.
Without the patch:
2021-03-29 15:05:01.581 IST [79029] ERROR: duplicate key value violates
unique constraint "pgbench_branches_pkey"
2021-03-29 15:05:01.581 IST [79029] DETAIL: Key (bid)=(1) already exists.
2021-03-29 15:05:01.581 IST [79029] CONTEXT: COPY pgbench_branches, line 1
2021-03-29 15:05:01.583 IST [78538] LOG: background worker "logical
replication worker" (PID 79029) exited with exit code 1
2021-03-29 15:05:06.593 IST [79031] LOG: logical replication table
synchronization worker for subscription "test_sub2", table
"pgbench_branches" has started... this message. The code that reports this error is from the COPY
command.
Row filter modifications has no control over it. It seems somehow your
subscriber close the replication connection causing this issue. Can you
reproduce it consistently? If so, please share your steps.Please ignore the report.
Thank you,
Rahila Syed
On Mon, Mar 29, 2021 at 6:47 PM Euler Taveira <euler@eulerto.com> wrote:
Few comments:
==============
1. How can we specify row filters for multiple tables for a
publication? Consider a case as below:
postgres=# CREATE TABLE tab_rowfilter_1 (a int primary key, b text);
CREATE TABLE
postgres=# CREATE TABLE tab_rowfilter_2 (c int primary key);
CREATE TABLE
postgres=# CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1,
tab_rowfilter_2 WHERE (a > 1000 AND b <> 'filtered');
ERROR: column "a" does not exist
LINE 1: ...FOR TABLE tab_rowfilter_1, tab_rowfilter_2 WHERE (a > 1000 A...
^
postgres=# CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1,
tab_rowfilter_2 WHERE (c > 1000);
CREATE PUBLICATION
It gives an error when I tried to specify the columns corresponding to
the first relation but is fine for columns of the second relation.
Then, I tried a few more combinations like below, but that didn't work.
CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 As t1,
tab_rowfilter_2 As t2 WHERE (t1.a > 1000 AND t1.b <> 'filtered');
Will users be allowed to specify join conditions among columns from
multiple tables?
2.
+ /*
+ * Although ALTER PUBLICATION grammar allows WHERE clause to be specified
+ * for DROP TABLE action, it doesn't make sense to allow it. We implement
+ * this restriction here, instead of complicating the grammar to enforce
+ * it.
+ */
+ if (stmt->tableAction == DEFELEM_DROP)
+ {
+ ListCell *lc;
+
+ foreach(lc, stmt->tables)
+ {
+ PublicationTable *t = lfirst(lc);
+
+ if (t->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot use a WHERE clause when removing table from
publication \"%s\"",
+ NameStr(pubform->pubname))));
+ }
+ }
Is there a reason to deal with this here separately rather than in the
ALTER PUBLICATION grammar?
--
With Regards,
Amit Kapila.
On Tue, Mar 30, 2021, at 8:23 AM, Amit Kapila wrote:
On Mon, Mar 29, 2021 at 6:47 PM Euler Taveira <euler@eulerto.com <mailto:euler%40eulerto.com>> wrote:
Few comments:
==============
1. How can we specify row filters for multiple tables for a
publication? Consider a case as below:
It is not possible. The row filter is a per-table option. Isn't it clear from the
synopsis? The current design allows a different row filter for each table in the same
publication. It is more flexible than a single row filter for a set of tables
(even if we supported such a variant, there are some cases where the
condition should be different because the column names are not the same). You
can easily build a CREATE PUBLICATION command that adds the same row filter
multiple times using a DO block or use a similar approach in your favorite
language.
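A minimal sketch of the DO-block approach mentioned above (assuming the publication already exists, the matched tables share an `id` column, and the table-name pattern is illustrative):

```sql
-- Apply the same WHERE clause to every table matching a pattern,
-- since the row filter is specified per table.
DO $$
DECLARE
    t text;
BEGIN
    FOR t IN
        SELECT quote_ident(schemaname) || '.' || quote_ident(tablename)
        FROM pg_tables
        WHERE tablename LIKE 'tab_rowfilter_%'
    LOOP
        EXECUTE format(
            'ALTER PUBLICATION tap_pub_1 ADD TABLE %s WHERE (id > 1000)', t);
    END LOOP;
END;
$$;
```

This keeps the per-table design while letting the client script factor out a common filter.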
postgres=# CREATE TABLE tab_rowfilter_1 (a int primary key, b text);
CREATE TABLE
postgres=# CREATE TABLE tab_rowfilter_2 (c int primary key);
CREATE TABLE
postgres=# CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1,
tab_rowfilter_2 WHERE (a > 1000 AND b <> 'filtered');
ERROR: column "a" does not exist
LINE 1: ...FOR TABLE tab_rowfilter_1, tab_rowfilter_2 WHERE (a > 1000 A...
                                                             ^
postgres=# CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1,
tab_rowfilter_2 WHERE (c > 1000);
CREATE PUBLICATION

It gives an error when I tried to specify the columns corresponding to
the first relation but is fine for columns of the second relation.
Then, I tried a few more combinations like below, but that didn't work.
CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 As t1,
tab_rowfilter_2 As t2 WHERE (t1.a > 1000 AND t1.b <> 'filtered');

Will users be allowed to specify join conditions among columns from
multiple tables?
It seems you are envisioning row filter as a publication property instead of a
publication-relation property. Due to the flexibility that the latter approach
provides, I decided to use it because it covers more use cases. Regarding
allowing joins, it could possibly slow down a critical path, no? This code path
is executed for every change. If there is interest in join support, we
might add it in a future patch.
2.
+ /*
+  * Although ALTER PUBLICATION grammar allows WHERE clause to be specified
+  * for DROP TABLE action, it doesn't make sense to allow it. We implement
+  * this restriction here, instead of complicating the grammar to enforce
+  * it.
+  */
+ if (stmt->tableAction == DEFELEM_DROP)
+ {
+     ListCell   *lc;
+
+     foreach(lc, stmt->tables)
+     {
+         PublicationTable *t = lfirst(lc);
+
+         if (t->whereClause)
+             ereport(ERROR,
+                     (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+                      errmsg("cannot use a WHERE clause when removing table from publication \"%s\"",
+                             NameStr(pubform->pubname))));
+     }
+ }

Is there a reason to deal with this here separately rather than in the
ALTER PUBLICATION grammar?
Good question. IIRC the issue is that AlterPublicationStmt->tables has a list
element that was a relation_expr_list and was converted to
publication_table_list. If we share 'tables' with relation_expr_list (for ALTER
PUBLICATION ... DROP TABLE) and publication_table_list (for the other ALTER
PUBLICATION ... ADD|SET TABLE), OpenTableList() has to know which list
element it is dealing with. I think I came to the conclusion that it is less
ugly to avoid changing OpenTableList() and CloseTableList().
[Doing some experimentation...]
Here is a patch that removes the referred code. It uses two distinct list
elements: relation_expr_list for ALTER PUBLICATION ... DROP TABLE and
publication_table_list for ALTER PUBLICATION ... ADD|SET TABLE. A new
parameter was introduced to deal with the different elements of the list
'tables'.
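A rough sketch of the grammar shape being described (production names follow the thread; the exact rules and actions in the patch may differ):

```
/* ADD/SET take tables with an optional WHERE; DROP takes plain relations. */
AlterPublicationStmt:
      ALTER PUBLICATION name ADD_P TABLE publication_table_list
    | ALTER PUBLICATION name SET TABLE publication_table_list
    | ALTER PUBLICATION name DROP TABLE relation_expr_list
    ;

publication_table_list:
      publication_table                              { $$ = list_make1($1); }
    | publication_table_list ',' publication_table   { $$ = lappend($1, $3); }
    ;

publication_table:
      relation_expr OptWhereClause
        {
            PublicationTable *n = makeNode(PublicationTable);
            n->relation = $1;
            n->whereClause = $2;
            $$ = (Node *) n;
        }
    ;
```

With this split, a WHERE clause on DROP becomes a syntax error, so no separate runtime check is needed for that case.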
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
0001-Row-filter-for-logical-replication.patch (text/x-patch) Download
From 99d36d0f5cb5f706c73fcdbb05772580f6814fe6 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 12:07:51 -0300
Subject: [PATCH] Row filter for logical replication
This feature adds row filters for publication tables. When you define or modify
a publication, you can optionally filter out rows that do not satisfy a WHERE
condition. It allows you to partially replicate a database or a set of tables.
The row filter is per table, which means that you can define different row
filters for different tables. A new row filter can be added simply by
specifying a WHERE clause after the table name. The WHERE expression must be
enclosed in parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, DELETEs
won't be replicated. DELETE uses the old row version (which is limited to the
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter; hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; this could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If subscriber is a pre-14 version, data
synchronization won't use row filters if they are defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains partitioned table, the parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false -- default) or the partitioned table row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 38 +++-
doc/src/sgml/ref/create_subscription.sgml | 11 +-
src/backend/catalog/pg_publication.c | 50 ++++-
src/backend/commands/publicationcmds.c | 125 +++++++----
src/backend/parser/gram.y | 24 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 13 ++
src/backend/parser/parse_func.c | 3 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/logical/worker.c | 14 +-
src/backend/replication/pgoutput/pgoutput.c | 202 ++++++++++++++++--
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/include/replication/logicalrelation.h | 2 +
src/test/regress/expected/publication.out | 34 +++
src/test/regress/sql/publication.sql | 23 ++
src/test/subscription/t/020_row_filter.pl | 221 ++++++++++++++++++++
25 files changed, 851 insertions(+), 101 deletions(-)
create mode 100644 src/test/subscription/t/020_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index f103d914a6..4f7ebcd967 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6213,6 +6213,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b2c6..ca091aae33 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is executed with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbca55..715c37f2bb 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression.
</para>
<para>
@@ -131,9 +135,15 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
on its partitions) contained in the publication will be published
using the identity and schema of the partitioned table rather than
that of the individual partitions that are actually changed; the
- latter is the default. Enabling this allows the changes to be
- replicated into a non-partitioned table or a partitioned table
- consisting of a different set of partitions.
+ latter is the default (<literal>false</literal>). Enabling this
+ allows the changes to be replicated into a non-partitioned table or a
+ partitioned table consisting of a different set of partitions.
+ </para>
+
+ <para>
+ If this parameter is <literal>false</literal>, it uses the
+ <literal>WHERE</literal> clause from the partition; otherwise, the
+ <literal>WHERE</literal> clause from the partitioned table is used.
</para>
<para>
@@ -182,6 +192,14 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
disallowed on those tables.
</para>
+ <para>
+ The <literal>WHERE</literal> clause should probably contain only columns
+ that are part of the primary key or that are covered by <literal>REPLICA
+ IDENTITY</literal>. Otherwise, <command>DELETE</command> operations will not
+ be replicated. For <command>INSERT</command> and <command>UPDATE</command>
+ operations, any column can be used in the <literal>WHERE</literal> clause.
+ </para>
+
<para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
@@ -197,6 +215,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -209,6 +232,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</programlisting>
</para>
+ <para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
<para>
Create a publication that publishes all changes in all tables:
<programlisting>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index e812beee37..6fcd2fd447 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,16 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If any table in the publications has a
+ <literal>WHERE</literal> clause, data synchronization does not use it
+ if the subscriber is a <productname>PostgreSQL</productname> version
+ before 14.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 86e415af89..78f5780fb7 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,18 +144,20 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationInfo *targetrel,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -161,7 +166,7 @@ publication_add_relation(Oid pubid, Relation targetrel,
* duplicates, it's here just to provide nicer error message in common
* case. The real protection is the unique key on the catalog.
*/
- if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid),
+ if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(targetrel->relid),
ObjectIdGetDatum(pubid)))
{
table_close(rel, RowExclusiveLock);
@@ -172,10 +177,27 @@ publication_add_relation(Oid pubid, Relation targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel), pub->name)));
+ RelationGetRelationName(targetrel->relation), pub->name)));
}
- check_publication_add_relation(targetrel);
+ check_publication_add_relation(targetrel->relation);
+
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(targetrel->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(targetrel->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -187,7 +209,13 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prpubid - 1] =
ObjectIdGetDatum(pubid);
values[Anum_pg_publication_rel_prrelid - 1] =
- ObjectIdGetDatum(relid);
+ ObjectIdGetDatum(targetrel->relid);
+
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
@@ -202,14 +230,20 @@ publication_add_relation(Oid pubid, Relation targetrel,
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
/* Add dependency on the relation */
- ObjectAddressSet(referenced, RelationRelationId, relid);
+ ObjectAddressSet(referenced, RelationRelationId, targetrel->relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel);
+ CacheInvalidateRelcache(targetrel->relation);
return myself;
}
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 95c253c8e0..1a19ed9972 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -48,8 +48,8 @@
/* Same as MAXNUMMESSAGES in sinvaladt.c */
#define MAX_RELCACHE_INVAL_MSGS 4096
-static List *OpenTableList(List *tables);
-static void CloseTableList(List *rels);
+static List *OpenTableList(List *tables, bool is_drop);
+static void CloseTableList(List *rels, bool is_drop);
static void PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
AlterPublicationStmt *stmt);
static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
@@ -232,9 +232,9 @@ CreatePublication(CreatePublicationStmt *stmt)
Assert(list_length(stmt->tables) > 0);
- rels = OpenTableList(stmt->tables);
+ rels = OpenTableList(stmt->tables, false);
PublicationAddTables(puboid, rels, true, NULL);
- CloseTableList(rels);
+ CloseTableList(rels, false);
}
table_close(rel, RowExclusiveLock);
@@ -372,7 +372,10 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
- rels = OpenTableList(stmt->tables);
+ if (stmt->tableAction == DEFELEM_DROP)
+ rels = OpenTableList(stmt->tables, true);
+ else
+ rels = OpenTableList(stmt->tables, false);
if (stmt->tableAction == DEFELEM_ADD)
PublicationAddTables(pubid, rels, false, stmt);
@@ -385,31 +388,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
+ PublicationRelationInfo *oldrel;
- foreach(newlc, rels)
- {
- Relation newrel = (Relation) lfirst(newlc);
-
- if (RelationGetRelid(newrel) == oldrelid)
- {
- found = true;
- break;
- }
- }
-
- if (!found)
- {
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
-
- delrels = lappend(delrels, oldrel);
- }
+ oldrel = palloc(sizeof(PublicationRelationInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -421,10 +417,10 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
*/
PublicationAddTables(pubid, rels, true, stmt);
- CloseTableList(delrels);
+ CloseTableList(delrels, false);
}
- CloseTableList(rels);
+ CloseTableList(rels, false);
}
/*
@@ -500,26 +496,42 @@ RemovePublicationRelById(Oid proid)
/*
* Open relations specified by a RangeVar list.
+ * AlterPublicationStmt->tables has a different list element, hence, is_drop
+ * indicates if it has a RangeVar (true) or PublicationTable (false).
* The returned tables are locked in ShareUpdateExclusiveLock mode in order to
* add them to a publication.
*/
static List *
-OpenTableList(List *tables)
+OpenTableList(List *tables, bool is_drop)
{
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationInfo *pri;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
- bool recurse = rv->inh;
+ PublicationTable *t;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
+ if (is_drop)
+ {
+ rv = castNode(RangeVar, lfirst(lc));
+ }
+ else
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+
+ recurse = rv->inh;
+
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
@@ -538,8 +550,12 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = myrelid;
+ pri->relation = rel;
+ if (!is_drop)
+ pri->whereClause = t->whereClause;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, myrelid);
/*
@@ -572,7 +588,13 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = childrelid;
+ pri->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (!is_drop)
+ pri->whereClause = t->whereClause;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, childrelid);
}
}
@@ -587,16 +609,28 @@ OpenTableList(List *tables)
* Close all relations in the list.
*/
static void
-CloseTableList(List *rels)
+CloseTableList(List *rels, bool is_drop)
{
ListCell *lc;
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ if (is_drop)
+ {
+ Relation rel = (Relation) lfirst(lc);
+
+ table_close(rel, NoLock);
+ }
+ else
+ {
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(pri->relation, NoLock);
+ }
}
+
+ if (!is_drop)
+ list_free_deep(rels);
}
/*
@@ -612,15 +646,15 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(pri->relid, GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(pri->relation->rd_rel->relkind),
+ RelationGetRelationName(pri->relation));
- obj = publication_add_relation(pubid, rel, if_not_exists);
+ obj = publication_add_relation(pubid, pri, if_not_exists);
if (stmt)
{
EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
@@ -644,11 +678,10 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pri->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -658,7 +691,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pri->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 7ff36bc842..a2899702c1 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -426,13 +426,13 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
vacuum_relation_list opt_vacuum_relation_list
- drop_option_list
+ drop_option_list publication_table_list
%type <groupclause> group_clause
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -9577,7 +9577,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9608,7 +9608,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9616,7 +9616,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9634,6 +9634,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index ceb0bf597d..86c16c3def 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -950,6 +957,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 03373d551f..52c46fc7a7 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -509,6 +516,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1769,6 +1777,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3054,6 +3065,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index debef1d14f..bec927e1da 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2533,6 +2533,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 8494db8f05..c3d6847b35 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -790,6 +794,55 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /* Get relation qual */
+ if (walrcv_server_version(wrconn) >= 140000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(wrconn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -803,6 +856,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -811,7 +865,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -820,16 +874,23 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* List of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -838,9 +899,31 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
+
res = walrcv_exec(wrconn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 354fbe4b4b..04750f1778 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -340,8 +340,8 @@ handle_streamed_transaction(LogicalRepMsgType action, StringInfo s)
*
* This is based on similar code in copy.c
*/
-static EState *
-create_estate_for_relation(LogicalRepRelMapEntry *rel)
+EState *
+create_estate_for_relation(Relation rel)
{
EState *estate;
RangeTblEntry *rte;
@@ -350,8 +350,8 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel)
rte = makeNode(RangeTblEntry);
rte->rtekind = RTE_RELATION;
- rte->relid = RelationGetRelid(rel->localrel);
- rte->relkind = rel->localrel->rd_rel->relkind;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
rte->rellockmode = AccessShareLock;
ExecInitRangeTable(estate, list_make1(rte));
@@ -1168,7 +1168,7 @@ apply_handle_insert(StringInfo s)
}
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -1293,7 +1293,7 @@ apply_handle_update(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
@@ -1450,7 +1450,7 @@ apply_handle_delete(StringInfo s)
check_relation_updatable(rel);
/* Initialize the executor state. */
- estate = create_estate_for_relation(rel);
+ estate = create_estate_for_relation(rel->localrel);
remoteslot = ExecInitExtraTupleSlot(estate,
RelationGetDescr(rel->localrel),
&TTSOpsVirtual);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 1b993fb032..00ad8f001f 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -15,12 +15,18 @@
#include "access/tupconvert.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
@@ -57,6 +63,10 @@ static void pgoutput_stream_abort(struct LogicalDecodingContext *ctx,
static void pgoutput_stream_commit(struct LogicalDecodingContext *ctx,
ReorderBufferTXN *txn,
XLogRecPtr commit_lsn);
+static ExprState *pgoutput_row_filter_prepare_expr(Node *rfnode, EState *estate);
+static inline bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, List *rowfilter);
static bool publications_valid;
static bool in_streaming;
@@ -98,6 +108,7 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *qual;
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -121,7 +132,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation rel);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -491,6 +502,124 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
OutputPluginWrite(ctx, false);
}
+static ExprState *
+pgoutput_row_filter_prepare_expr(Node *rfnode, EState *estate)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ exprstate = ExecPrepareExpr(expr, estate);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false i.e. the change
+ * isn't replicated.
+ */
+static inline bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, List *rowfilter)
+{
+ TupleDesc tupdesc;
+ EState *estate;
+ ExprContext *ecxt;
+ MemoryContext oldcxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (rowfilter == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ tupdesc = RelationGetDescr(relation);
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldcxt);
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, rowfilter)
+ {
+ Node *rfnode = (Node *) lfirst(lc);
+ ExprState *exprstate;
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_prepare_expr(rfnode, estate);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ elog(DEBUG3, "row filter \"%s\" %smatched",
+ TextDatumGetCString(DirectFunctionCall2(pg_get_expr,
+ CStringGetTextDatum(nodeToString(rfnode)),
+ ObjectIdGetDatum(relation->rd_id))),
+ result ? "" : " not");
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+
+ return result;
+}
+
/*
* Sends the decoded DML over wire.
*
@@ -518,7 +647,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -562,6 +691,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
tuple = execute_attr_map_tuple(tuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry->qual))
+ return;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_insert(ctx->out, xid, relation, tuple,
data->binary);
@@ -588,6 +721,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry->qual))
+ return;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_update(ctx->out, xid, relation, oldtuple,
newtuple, data->binary);
@@ -610,6 +747,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry->qual))
+ return;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
data->binary);
@@ -657,12 +798,11 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
for (i = 0; i < nrelations; i++)
{
Relation relation = relations[i];
- Oid relid = RelationGetRelid(relation);
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -672,10 +812,10 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
* root tables through it.
*/
if (relation->rd_rel->relispartition &&
- relentry->publish_as_relid != relid)
+ relentry->publish_as_relid != relentry->relid)
continue;
- relids[nrelids++] = relid;
+ relids[nrelids++] = relentry->relid;
maybe_send_schema(ctx, txn, change, relation, relentry);
}
@@ -944,16 +1084,21 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation rel)
{
RelationSyncEntry *entry;
- bool am_partition = get_rel_relispartition(relid);
- char relkind = get_rel_relkind(relid);
+ Oid relid;
+ bool am_partition;
+ char relkind;
bool found;
MemoryContext oldctx;
Assert(RelationSyncCache != NULL);
+ relid = RelationGetRelid(rel);
+ am_partition = get_rel_relispartition(relid);
+ relkind = get_rel_relkind(relid);
+
/* Find cached relation info, creating if not found */
entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
(void *) &relid,
@@ -969,6 +1114,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->qual = NIL;
entry->publish_as_relid = InvalidOid;
}
@@ -1000,6 +1146,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1059,9 +1208,29 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ entry->qual = lappend(entry->qual, rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1167,6 +1336,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1176,6 +1346,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1193,5 +1365,11 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->qual != NIL)
+ list_free_deep(entry->qual);
+ entry->qual = NIL;
}
+
+ MemoryContextSwitchTo(oldctx);
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 25717ce0e6..4b2316b01e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4220,6 +4220,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4230,9 +4231,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 140000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4241,6 +4249,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4281,6 +4290,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4313,8 +4326,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 5340843081..155fc2ebb5 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -626,6 +626,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 440249ff69..35d901295d 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6167,8 +6167,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 140000)
+ appendPQExpBuffer(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBuffer(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6197,6 +6204,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 1b31fee9e3..9a60211259 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,6 +85,13 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationInfo
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationInfo;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -110,7 +117,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationInfo *targetrel,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index aecf53b3b3..e2becb12eb 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, on pg_publication_rel using btree(oid oid_ops));
#define PublicationRelObjectIndexId 6112
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 299956f329..f242092300 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -486,6 +486,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 334262b1dd..f34e92b2b2 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3575,12 +3575,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3593,7 +3600,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index a71d7e1f74..01bed24717 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 3f0b3deefb..a59ad2c9c8 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -49,4 +49,6 @@ extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
extern void logicalrep_typmap_update(LogicalRepTyp *remotetyp);
extern char *logicalrep_typmap_gettypname(Oid remoteid);
+extern EState *create_estate_for_relation(Relation rel);
+
#endif /* LOGICALRELATION_H */
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 63d6ab7a4e..96d869dd27 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -156,6 +156,40 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: "testpub_view" is not a table
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075368..35211c56f6 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,29 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+\dRp+ testpub5
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/020_row_filter.pl b/src/test/subscription/t/020_row_filter.pl
new file mode 100644
index 0000000000..35a41741d3
--- /dev/null
+++ b/src/test/subscription/t/020_row_filter.pl
@@ -0,0 +1,221 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+
+# test row filtering
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x"
+);
+# use partition row filter:
+# - replicate (1, 100) because 1 < 6000 is true
+# - don't replicate (8000, 101) because 8000 < 6000 is false
+# - replicate (15000, 102) because partition tab_rowfilter_greater_10k doesn't have row filter
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(8000, 101),(15000, 102)"
+);
+# insert directly into partition
+# use partition row filter: replicate (2, 200) because 2 < 6000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200)");
+# use partition row filter: replicate (5500, 300) because 5500 < 6000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(5500, 300)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table sync to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102), 'check filtered data was copied to subscriber');
+
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+# UPDATE is not replicated ; row filter evaluates to false when b = NULL
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+# DELETE is not replicated ; b is not part of the PK or replica identity and
+# old tuple contains b = NULL, hence, row filter evaluates to false
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check filtered data was copied to subscriber');
+
+# publish using partitioned table
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+# use partitioned table row filter: replicate, 4000 < 5000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400)");
+# use partitioned table row filter: replicate, 4500 < 5000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+# use partitioned table row filter: don't replicate, 5600 < 5000 is false
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+# use partitioned table row filter: don't replicate, 16000 < 5000 is false
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 1950)");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
2.20.1
On Wed, Mar 31, 2021 at 7:17 AM Euler Taveira <euler@eulerto.com> wrote:
On Tue, Mar 30, 2021, at 8:23 AM, Amit Kapila wrote:
On Mon, Mar 29, 2021 at 6:47 PM Euler Taveira <euler@eulerto.com> wrote:
Few comments:
==============
1. How can we specify row filters for multiple tables for a
publication? Consider a case as below:

It is not possible. Row filter is a per-table option. Isn't it clear from the
synopsis?
Sorry, it seems I didn't read it properly earlier, now I got it.
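For the record, the synopsis allows one optional WHERE clause per table in the
same publication; a sketch (table and column names here are made up):

CREATE PUBLICATION pub_filtered FOR TABLE
    orders WHERE (region_id > 100),
    customers WHERE (active IS TRUE);

Each filter applies only to the table it follows.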
2.
+	/*
+	 * Although ALTER PUBLICATION grammar allows WHERE clause to be specified
+	 * for DROP TABLE action, it doesn't make sense to allow it. We implement
+	 * this restriction here, instead of complicating the grammar to enforce
+	 * it.
+	 */
+	if (stmt->tableAction == DEFELEM_DROP)
+	{
+		ListCell   *lc;
+
+		foreach(lc, stmt->tables)
+		{
+			PublicationTable *t = lfirst(lc);
+
+			if (t->whereClause)
+				ereport(ERROR,
+						(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+						 errmsg("cannot use a WHERE clause when removing table from publication \"%s\"",
+								NameStr(pubform->pubname))));
+		}
+	}

Is there a reason to deal with this here separately rather than in the
ALTER PUBLICATION grammar?

Good question. IIRC the issue is that AlterPublicationStmt->tables has a list
element that was a relation_expr_list and was converted to
publication_table_list. If we share 'tables' with relation_expr_list (for ALTER
PUBLICATION ... DROP TABLE) and publication_table_list (for the other ALTER
PUBLICATION ... ADD|SET TABLE), the OpenTableList() has to know what list
element it is dealing with. I think I came to the conclusion that it is less
ugly to avoid changing OpenTableList() and CloseTableList().

[Doing some experimentation...]
Here is a patch that removes the referred code.
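To illustrate the restriction being discussed (publication and table names are
hypothetical): a WHERE clause makes sense when adding or setting a table, but
not when removing one, so only the first of these is accepted:

ALTER PUBLICATION mypub ADD TABLE mytab WHERE (id > 10);   -- accepted
ALTER PUBLICATION mypub DROP TABLE mytab WHERE (id > 10);  -- rejected

A filter has no meaning for a table that is being removed from the publication.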
Thanks, few more comments:
1. In pgoutput_change, we are always sending schema even though we
don't send actual data because of row filters. It may not be a problem
in many cases but I guess for some odd cases we can avoid sending
extra information.
2. In get_rel_sync_entry(), we are caching the qual for rel_sync_entry
even though we won't publish it which seems unnecessary?
3.
@@ -1193,5 +1365,11 @@ rel_sync_cache_publication_cb(Datum arg, int
cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->qual != NIL)
+ list_free_deep(entry->qual);
Seeing one previous comment in this thread [1], I am wondering if
list_free_deep is enough here?
4. Can we write explicitly in the docs that row filters won't apply
for Truncate operation?
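For example, assuming a hypothetical setup such as:

CREATE PUBLICATION mypub FOR TABLE mytab WHERE (id > 10);
TRUNCATE mytab;

the TRUNCATE is replicated as a whole-table operation; the row filter only
applies to INSERT/UPDATE/DELETE changes.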
5. Getting some whitespace errors:
git am /d/PostgreSQL/Patches/logical_replication/row_filter/v14-0001-Row-filter-for-logical-replication.patch
.git/rebase-apply/patch:487: trailing whitespace.
warning: 1 line adds whitespace errors.
Applying: Row filter for logical replication
[1]: /messages/by-id/20181123161933.jpepibtyayflz2xg@alvherre.pgsql
--
With Regards,
Amit Kapila.
Hi,
As far as I can tell you have not *AT ALL* addressed that it is *NOT
SAFE* to evaluate arbitrary expressions from within an output
plugin. Despite that having been brought up multiple times.
+static ExprState *
+pgoutput_row_filter_prepare_expr(Node *rfnode, EState *estate)
+{
+	ExprState  *exprstate;
+	Oid			exprtype;
+	Expr	   *expr;
+
+	/* Prepare expression for execution */
+	exprtype = exprType(rfnode);
+	expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+	if (expr == NULL)
+		ereport(ERROR,
+				(errcode(ERRCODE_CANNOT_COERCE),
+				 errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+						format_type_be(exprtype),
+						format_type_be(BOOLOID)),
+				 errhint("You will need to rewrite the row filter.")));
+
+	exprstate = ExecPrepareExpr(expr, estate);
+
+	return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false i.e. the change
+ * isn't replicated.
+ */
+static inline bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+	Datum		ret;
+	bool		isnull;
+
+	Assert(state != NULL);
+
+	ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+	elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+		 DatumGetBool(ret) ? "true" : "false",
+		 isnull ? "true" : "false");
+
+	if (isnull)
+		return false;
+
+	return DatumGetBool(ret);
+}
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, List *rowfilter)
+{
+	TupleDesc	tupdesc;
+	EState	   *estate;
+	ExprContext *ecxt;
+	MemoryContext oldcxt;
+	ListCell   *lc;
+	bool		result = true;
+
+	/* Bail out if there is no row filter */
+	if (rowfilter == NIL)
+		return true;
+
+	elog(DEBUG3, "table \"%s.%s\" has row filter",
+		 get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+		 get_rel_name(relation->rd_id));
+
+	tupdesc = RelationGetDescr(relation);
+
+	estate = create_estate_for_relation(relation);
+
+	/* Prepare context per tuple */
+	ecxt = GetPerTupleExprContext(estate);
+	oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+	ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
+	MemoryContextSwitchTo(oldcxt);
+
+	ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+	/*
+	 * If the subscription has multiple publications and the same table has a
+	 * different row filter in these publications, all row filters must be
+	 * matched in order to replicate this change.
+	 */
+	foreach(lc, rowfilter)
+	{
+		Node	   *rfnode = (Node *) lfirst(lc);
+		ExprState  *exprstate;
+
+		/* Prepare for expression execution */
+		exprstate = pgoutput_row_filter_prepare_expr(rfnode, estate);
+
+		/* Evaluates row filter */
+		result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
Also, this still seems like an *extremely* expensive thing to do for
each tuple. It'll often be *vastly* faster to just send the data to the
other side than to evaluate the filter.
This just cannot be done once per tuple. It has to be cached.
I don't see how these issues can be addressed in the next 7 days,
therefore I think this unfortunately needs to be marked as returned with
feedback.
Greetings,
Andres Freund
On Wed, Mar 31, 2021 at 12:47 PM Euler Taveira <euler@eulerto.com> wrote:
....
Good question. IIRC the issue is that AlterPublicationStmt->tables has a list
element that was a relation_expr_list and was converted to
publication_table_list. If we share 'tables' with relation_expr_list (for ALTER
PUBLICATION ... DROP TABLE) and publication_table_list (for the other ALTER
PUBLICATION ... ADD|SET TABLE), the OpenTableList() has to know what list
element it is dealing with. I think I came to the conclusion that it is less
uglier to avoid changing OpenTableList() and CloseTableList().

[Doing some experimentation...]
Here is a patch that removes the referred code. It uses two distinct list
elements: relation_expr_list for ALTER PUBLICATION ... DROP TABLE and
publication_table_list for ALTER PUBLICATION ... ADD|SET TABLE. A new
parameter was introduced to deal with the different elements of the list
'tables'.
AFAIK this is the latest patch available, but FYI it no longer applies
cleanly on HEAD.
git apply ../patches_misc/0001-Row-filter-for-logical-replication.patch
../patches_misc/0001-Row-filter-for-logical-replication.patch:518:
trailing whitespace.
error: patch failed: src/backend/parser/gram.y:426
error: src/backend/parser/gram.y: patch does not apply
error: patch failed: src/backend/replication/logical/worker.c:340
error: src/backend/replication/logical/worker.c: patch does not apply
--------
Kind Regards,
Peter Smith.
Fujitsu Australia.
On Mon, May 10, 2021, at 5:19 AM, Peter Smith wrote:
AFAIK this is the latest patch available, but FYI it no longer applies
cleanly on HEAD.
Peter, the last patch is broken since f3b141c4825. I'm still working on it for
the next CF. I already addressed the points suggested by Amit in his last
review; however, I'm still working on a cache for evaluating expressions as
suggested by Andres. I hope to post a new patch soon.
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Mon, May 10, 2021 at 11:42 PM Euler Taveira <euler@eulerto.com> wrote:
On Mon, May 10, 2021, at 5:19 AM, Peter Smith wrote:
AFAIK this is the latest patch available, but FYI it no longer applies
cleanly on HEAD.

Peter, the last patch is broken since f3b141c4825. I'm still working on it for
the next CF. I already addressed the points suggested by Amit in his last
review; however, I'm still working on a cache for evaluating expression as
suggested by Andres. I hope to post a new patch soon.
Is there any ETA for your new patch?
In the interim can you rebase the old patch just so it builds and I can try it?
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Wed, Jun 9, 2021 at 5:33 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Mon, May 10, 2021 at 11:42 PM Euler Taveira <euler@eulerto.com> wrote:
On Mon, May 10, 2021, at 5:19 AM, Peter Smith wrote:
AFAIK this is the latest patch available, but FYI it no longer applies
cleanly on HEAD.

Peter, the last patch is broken since f3b141c4825. I'm still working on it for
the next CF. I already addressed the points suggested by Amit in his last
review; however, I'm still working on a cache for evaluating expression as
suggested by Andres. I hope to post a new patch soon.

Is there any ETA for your new patch?
In the interim can you rebase the old patch just so it builds and I can try it?
I have rebased the patch so that you can try it out. The main thing I
have done is to remove changes in worker.c and created a specialized
function to create estate for pgoutput.c as I don't think we need what
is done in worker.c.
Euler, do let me know if you are not happy with the change in pgoutput.c?
--
With Regards,
Amit Kapila.
Attachments:
v15-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v15-0001-Row-filter-for-logical-replication.patchDownload
From 0514e9b4166ece430164b7e506955f8edc23020b Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Fri, 18 Jun 2021 16:16:41 +0530
Subject: [PATCH v15] Row filter for logical replication
This feature adds row filter for publication tables. When you define or modify
a publication you can optionally filter rows that do not satisfy a WHERE
condition. It allows you to partially replicate a database or set of tables.
The row filter is per table which means that you can define different row
filters for different tables. A new row filter can be added simply by
specifying a WHERE clause after the table name. The WHERE expression must be
enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If the subscriber is a pre-14 version, data
synchronization won't use row filters if they are defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false -- default) or the partitioned table row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 38 ++++-
doc/src/sgml/ref/create_subscription.sgml | 11 +-
src/backend/catalog/pg_publication.c | 50 ++++++-
src/backend/commands/publicationcmds.c | 127 ++++++++++------
src/backend/parser/gram.y | 24 ++-
src/backend/parser/parse_agg.c | 10 ++
src/backend/parser/parse_expr.c | 13 ++
src/backend/parser/parse_func.c | 3 +
src/backend/replication/logical/tablesync.c | 96 +++++++++++-
src/backend/replication/pgoutput/pgoutput.c | 225 ++++++++++++++++++++++++++--
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 34 +++++
src/test/regress/sql/publication.sql | 23 +++
src/test/subscription/t/020_row_filter.pl | 221 +++++++++++++++++++++++++++
23 files changed, 867 insertions(+), 95 deletions(-)
create mode 100644 src/test/subscription/t/020_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index f517a7d..dbf2f46 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6233,6 +6233,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b..ca091aa 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is executed with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbc..715c37f 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression.
</para>
<para>
@@ -131,9 +135,15 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
on its partitions) contained in the publication will be published
using the identity and schema of the partitioned table rather than
that of the individual partitions that are actually changed; the
- latter is the default. Enabling this allows the changes to be
- replicated into a non-partitioned table or a partitioned table
- consisting of a different set of partitions.
+ latter is the default (<literal>false</literal>). Enabling this
+ allows the changes to be replicated into a non-partitioned table or a
+ partitioned table consisting of a different set of partitions.
+ </para>
+
+ <para>
+ If this parameter is <literal>false</literal>, it uses the
+ <literal>WHERE</literal> clause from the partition; otherwise, the
+ <literal>WHERE</literal> clause from the partitioned table is used.
</para>
<para>
@@ -183,6 +193,14 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should probably contain only columns
+ that are part of the primary key or be covered by <literal>REPLICA
+ IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
+ be replicated. For <command>INSERT</command> and <command>UPDATE</command>
+ operations, any column can be used in the <literal>WHERE</literal> clause.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -197,6 +215,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -210,6 +233,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index e812bee..6fcd2fd 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,16 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If any table in the publications has a
+ <literal>WHERE</literal> clause, data synchronization does not use it
+ if the subscriber is a <productname>PostgreSQL</productname> version
+ before 14.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 86e415a..78f5780 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,18 +144,20 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationInfo *targetrel,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -161,7 +166,7 @@ publication_add_relation(Oid pubid, Relation targetrel,
* duplicates, it's here just to provide nicer error message in common
* case. The real protection is the unique key on the catalog.
*/
- if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid),
+ if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(targetrel->relid),
ObjectIdGetDatum(pubid)))
{
table_close(rel, RowExclusiveLock);
@@ -172,10 +177,27 @@ publication_add_relation(Oid pubid, Relation targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel), pub->name)));
+ RelationGetRelationName(targetrel->relation), pub->name)));
}
- check_publication_add_relation(targetrel);
+ check_publication_add_relation(targetrel->relation);
+
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(targetrel->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(targetrel->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -187,7 +209,13 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prpubid - 1] =
ObjectIdGetDatum(pubid);
values[Anum_pg_publication_rel_prrelid - 1] =
- ObjectIdGetDatum(relid);
+ ObjectIdGetDatum(targetrel->relid);
+
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
@@ -202,14 +230,20 @@ publication_add_relation(Oid pubid, Relation targetrel,
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
/* Add dependency on the relation */
- ObjectAddressSet(referenced, RelationRelationId, relid);
+ ObjectAddressSet(referenced, RelationRelationId, targetrel->relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel);
+ CacheInvalidateRelcache(targetrel->relation);
return myself;
}
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 95c253c..9869484 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -48,8 +48,8 @@
/* Same as MAXNUMMESSAGES in sinvaladt.c */
#define MAX_RELCACHE_INVAL_MSGS 4096
-static List *OpenTableList(List *tables);
-static void CloseTableList(List *rels);
+static List *OpenTableList(List *tables, bool is_drop);
+static void CloseTableList(List *rels, bool is_drop);
static void PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
AlterPublicationStmt *stmt);
static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
@@ -232,9 +232,9 @@ CreatePublication(CreatePublicationStmt *stmt)
Assert(list_length(stmt->tables) > 0);
- rels = OpenTableList(stmt->tables);
+ rels = OpenTableList(stmt->tables, false);
PublicationAddTables(puboid, rels, true, NULL);
- CloseTableList(rels);
+ CloseTableList(rels, false);
}
table_close(rel, RowExclusiveLock);
@@ -372,7 +372,10 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
- rels = OpenTableList(stmt->tables);
+ if (stmt->tableAction == DEFELEM_DROP)
+ rels = OpenTableList(stmt->tables, true);
+ else
+ rels = OpenTableList(stmt->tables, false);
if (stmt->tableAction == DEFELEM_ADD)
PublicationAddTables(pubid, rels, false, stmt);
@@ -385,31 +388,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- Relation newrel = (Relation) lfirst(newlc);
-
- if (RelationGetRelid(newrel) == oldrelid)
- {
- found = true;
- break;
- }
- }
-
- if (!found)
- {
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
-
- delrels = lappend(delrels, oldrel);
- }
+ PublicationRelationInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelationInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -421,10 +417,10 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
*/
PublicationAddTables(pubid, rels, true, stmt);
- CloseTableList(delrels);
+ CloseTableList(delrels, false);
}
- CloseTableList(rels);
+ CloseTableList(rels, false);
}
/*
@@ -500,26 +496,42 @@ RemovePublicationRelById(Oid proid)
/*
* Open relations specified by a RangeVar list.
+ * AlterPublicationStmt->tables has a different list element, hence, is_drop
+ * indicates if it has a RangeVar (true) or PublicationTable (false).
* The returned tables are locked in ShareUpdateExclusiveLock mode in order to
* add them to a publication.
*/
static List *
-OpenTableList(List *tables)
+OpenTableList(List *tables, bool is_drop)
{
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationInfo *pri;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
- bool recurse = rv->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
+ if (is_drop)
+ {
+ rv = castNode(RangeVar, lfirst(lc));
+ }
+ else
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+
+ recurse = rv->inh;
+
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
@@ -538,8 +550,12 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = myrelid;
+ pri->relation = rel;
+ if (!is_drop)
+ pri->whereClause = t->whereClause;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, myrelid);
/*
@@ -572,7 +588,13 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = childrelid;
+ pri->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (!is_drop)
+ pri->whereClause = t->whereClause;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, childrelid);
}
}
@@ -587,16 +609,28 @@ OpenTableList(List *tables)
* Close all relations in the list.
*/
static void
-CloseTableList(List *rels)
+CloseTableList(List *rels, bool is_drop)
{
ListCell *lc;
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ if (is_drop)
+ {
+ Relation rel = (Relation) lfirst(lc);
+
+ table_close(rel, NoLock);
+ }
+ else
+ {
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(pri->relation, NoLock);
+ }
}
+
+ if (!is_drop)
+ list_free_deep(rels);
}
/*
@@ -612,15 +646,15 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(pri->relid, GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(pri->relation->rd_rel->relkind),
+ RelationGetRelationName(pri->relation));
- obj = publication_add_relation(pubid, rel, if_not_exists);
+ obj = publication_add_relation(pubid, pri, if_not_exists);
if (stmt)
{
EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
@@ -644,11 +678,10 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pri->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -658,7 +691,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pri->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index eb24195..d82ea00 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -426,14 +426,14 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
vacuum_relation_list opt_vacuum_relation_list
- drop_option_list
+ drop_option_list publication_table_list
%type <node> opt_routine_body
%type <groupclause> group_clause
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -9612,7 +9612,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9643,7 +9643,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9651,7 +9651,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9669,6 +9669,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
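[Reviewer note: with the grammar above, an OptWhereClause attaches to the immediately preceding relation in the list, not to the list as a whole. A sketch of the accepted syntax, using hypothetical table names:]

```sql
-- Each entry in publication_table_list is relation_expr plus an optional WHERE
CREATE PUBLICATION pub_example FOR TABLE
    orders WHERE (status = 'active'),
    customers;                          -- no filter on this table

-- The same publication_table_list is reused by ALTER PUBLICATION ... ADD/SET TABLE
ALTER PUBLICATION pub_example ADD TABLE invoices WHERE (amount > 0);
```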
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 9562ffc..e0efe22 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -950,6 +957,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f928c32..fc4170e 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -509,6 +516,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1769,6 +1777,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3089,6 +3100,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..e946f17 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
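[Reviewer note: taken together, the parser changes above reject aggregates, grouping operations, window functions, plain function calls, subqueries, and set-returning functions in a row filter. A sketch of statements that would now fail, assuming a hypothetical table t(a int, b text):]

```sql
ALTER PUBLICATION p ADD TABLE t WHERE (length(b) < 6);    -- functions disallowed
ALTER PUBLICATION p ADD TABLE t WHERE (sum(a) > 10);      -- aggregates disallowed
ALTER PUBLICATION p ADD TABLE t WHERE (a IN (SELECT 1));  -- subqueries disallowed
```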
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index cc50eb8..42eab76 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -691,19 +691,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the
+ * relation qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -799,6 +803,55 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /* Get relation qual */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 140000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -812,6 +865,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -820,7 +874,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -829,16 +883,23 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* List of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -847,8 +908,31 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, q);
+ }
+
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
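[Reviewer note: to illustrate the tablesync changes for the publication from the cover letter, the worker first deparses the stored quals on the publisher and then builds a filtered COPY. Roughly (OID and column list are illustrative only):]

```sql
-- fetch_remote_table_info(): deparse the stored filters on the publisher
SELECT pg_get_expr(prqual, prrelid)
  FROM pg_publication p
       INNER JOIN pg_publication_rel pr ON (p.oid = pr.prpubid)
 WHERE pr.prrelid = 16384 AND p.pubname IN ('foo');

-- copy_table(): filters from all matching publications are AND'ed together
COPY (SELECT id FROM public.departments
      WHERE (id > 2000 AND id <= 3000)) TO STDOUT;
```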
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 63f108f..3ea9910 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,14 +13,21 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
@@ -61,6 +68,11 @@ static void pgoutput_stream_abort(struct LogicalDecodingContext *ctx,
static void pgoutput_stream_commit(struct LogicalDecodingContext *ctx,
ReorderBufferTXN *txn,
XLogRecPtr commit_lsn);
+static ExprState *pgoutput_row_filter_prepare_expr(Node *rfnode, EState *estate);
+static inline bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, List *rowfilter);
+static EState *create_estate_for_relation(Relation rel);
static bool publications_valid;
static bool in_streaming;
@@ -99,6 +111,7 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *qual;
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -122,7 +135,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation rel);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -520,6 +533,145 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
OutputPluginWrite(ctx, false);
}
+static ExprState *
+pgoutput_row_filter_prepare_expr(Node *rfnode, EState *estate)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ exprstate = ExecPrepareExpr(expr, estate);
+
+ return exprstate;
+}
+
+/*
+ * Evaluate the row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false, i.e. the change
+ * isn't replicated.
+ */
+static inline bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated; otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, List *rowfilter)
+{
+ TupleDesc tupdesc;
+ EState *estate;
+ ExprContext *ecxt;
+ MemoryContext oldcxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (rowfilter == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ tupdesc = RelationGetDescr(relation);
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldcxt);
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, rowfilter)
+ {
+ Node *rfnode = (Node *) lfirst(lc);
+ ExprState *exprstate;
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_prepare_expr(rfnode, estate);
+
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ elog(DEBUG3, "row filter \"%s\" %smatched",
+ TextDatumGetCString(DirectFunctionCall2(pg_get_expr,
+ CStringGetTextDatum(nodeToString(rfnode)),
+ ObjectIdGetDatum(relation->rd_id))),
+ result ? "" : "not ");
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+
+ return result;
+}
+
+/* Executor state preparation for evaluation of row filter expressions. */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(true);
+
+ return estate;
+}
+
/*
* Sends the decoded DML over wire.
*
@@ -547,7 +699,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -591,6 +743,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
tuple = execute_attr_map_tuple(tuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry->qual))
+ return;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_insert(ctx->out, xid, relation, tuple,
data->binary);
@@ -620,6 +776,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry->qual))
+ return;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_update(ctx->out, xid, relation, oldtuple,
newtuple, data->binary);
@@ -642,6 +802,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry->qual))
+ return;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
data->binary);
@@ -689,12 +853,11 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
for (i = 0; i < nrelations; i++)
{
Relation relation = relations[i];
- Oid relid = RelationGetRelid(relation);
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -704,10 +867,10 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
* root tables through it.
*/
if (relation->rd_rel->relispartition &&
- relentry->publish_as_relid != relid)
+ relentry->publish_as_relid != relentry->relid)
continue;
- relids[nrelids++] = relid;
+ relids[nrelids++] = relentry->relid;
maybe_send_schema(ctx, txn, change, relation, relentry);
}
@@ -1005,16 +1168,21 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation rel)
{
RelationSyncEntry *entry;
- bool am_partition = get_rel_relispartition(relid);
- char relkind = get_rel_relkind(relid);
+ Oid relid;
+ bool am_partition;
+ char relkind;
bool found;
MemoryContext oldctx;
Assert(RelationSyncCache != NULL);
+ relid = RelationGetRelid(rel);
+ am_partition = get_rel_relispartition(relid);
+ relkind = get_rel_relkind(relid);
+
/* Find cached relation info, creating if not found */
entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
(void *) &relid,
@@ -1030,6 +1198,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->qual = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if needed */
}
@@ -1062,6 +1231,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1121,9 +1293,29 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ entry->qual = lappend(entry->qual, rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1241,6 +1433,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1250,6 +1443,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1267,5 +1462,11 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->qual != NIL)
+ list_free_deep(entry->qual);
+ entry->qual = NIL;
}
+
+ MemoryContextSwitchTo(oldctx);
}
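[Reviewer note: a consequence of the NULL-is-false rule in pgoutput_row_filter_exec_expr() is that rows for which the filter is indeterminate are silently skipped. Using the cover-letter publication as an example (single-column table assumed for brevity):]

```sql
-- row filter: (id > 2000 AND id <= 3000)
INSERT INTO departments VALUES (2500);  -- filter is true, change is sent
INSERT INTO departments VALUES (NULL);  -- filter is NULL, treated as false,
                                        -- change is not sent
```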
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 8f53cc7..b117951 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4156,6 +4156,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4166,9 +4167,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 140000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4177,6 +4185,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4217,6 +4226,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4249,8 +4262,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
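[Reviewer note: with the pg_dump changes above, a dump of the cover-letter publication would contain something like the following; the extra parentheses come from pg_get_expr() output wrapped in the WHERE (%s) template:]

```sql
ALTER PUBLICATION foo ADD TABLE ONLY public.departments
    WHERE (((id > 2000) AND (id <= 3000)));
```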
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 49e1b0a..57c690f 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -624,6 +624,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 2abf255..fcdb1c2 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 140000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBufferStr(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 1b31fee..9a60211 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,6 +85,13 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationInfo
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationInfo;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -110,7 +117,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationInfo *targetrel,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index aecf53b..e2becb1 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, on pg_publication_rel using btree(oid oid_ops));
#define PublicationRelObjectIndexId 6112
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
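[Reviewer note: since prqual is stored as a pg_node_tree, it must be deparsed to be readable; the same pg_get_expr() call used by pg_dump and psql above also works interactively:]

```sql
SELECT prrelid::regclass AS relation,
       pg_get_expr(prqual, prrelid) AS row_filter
  FROM pg_catalog.pg_publication_rel;
```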
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index d9e417b..2037705 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -491,6 +491,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index def9651..cf815cc 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3624,12 +3624,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3642,7 +3649,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 1500de2..4537543 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 63d6ab7..96d869d 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -156,6 +156,40 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: "testpub_view" is not a table
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075..35211c5 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,29 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+\dRp+ testpub5
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/020_row_filter.pl b/src/test/subscription/t/020_row_filter.pl
new file mode 100644
index 0000000..35a4174
--- /dev/null
+++ b/src/test/subscription/t/020_row_filter.pl
@@ -0,0 +1,221 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+
+# test row filtering
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x"
+);
+# use partition row filter:
+# - replicate (1, 100) because 1 < 6000 is true
+# - don't replicate (8000, 101) because 8000 < 6000 is false
+# - replicate (15000, 102) because partition tab_rowfilter_greater_10k doesn't have row filter
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(8000, 101),(15000, 102)"
+);
+# insert directly into partition
+# use partition row filter: replicate (2, 200) because 2 < 6000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200)");
+# use partition row filter: replicate (5500, 300) because 5500 < 6000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(5500, 300)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table sync to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102), 'check filtered data was copied to subscriber');
+
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+# UPDATE is not replicated; row filter evaluates to false when b = NULL
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+# DELETE is not replicated; b is not part of the PK or replica identity and
+# the old tuple contains b = NULL, hence the row filter evaluates to false
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check filtered data was copied to subscriber');
+
+# publish using partitioned table
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+# use partitioned table row filter: replicate, 4000 < 5000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400)");
+# use partitioned table row filter: replicate, 4500 < 5000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+# use partitioned table row filter: don't replicate, 5600 < 5000 is false
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+# use partitioned table row filter: don't replicate, 16000 < 5000 is false
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 1950)");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
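For readers following the partitioned-table cases in the TAP test above, the filter-selection rule can be sketched in a few lines. This is a hypothetical model of the behavior the test exercises, not the patch's code; the function and variable names are illustrative only:

```python
# Model of which row filter applies to a change in a leaf partition:
# with publish_via_partition_root = false (the default) the leaf
# partition's own filter is used; with true, the root table's filter is.
def effective_filter(via_partition_root, root_filter, leaf_filter):
    """Return the predicate used to filter rows changed in a leaf partition."""
    return root_filter if via_partition_root else leaf_filter

# tap_pub_3: root filter (a < 5000), leaf tab_rowfilter_less_10k filter (a < 6000)
root = lambda row: row["a"] < 5000
leaf = lambda row: row["a"] < 6000

f = effective_filter(False, root, leaf)
assert f({"a": 5500}) is True    # (5500, 300) replicated: 5500 < 6000
f = effective_filter(True, root, leaf)
assert f({"a": 5500}) is False   # after SET publish_via_partition_root = true
```

This mirrors why (5500, 300) reaches the subscriber before the publication is switched to publish_via_partition_root, while a row with a = 5600 is dropped afterwards.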
On Fri, Jun 18, 2021 at 9:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
[...]
I have rebased the patch so that you can try it out. The main thing I
have done is to remove changes in worker.c and created a specialized
function to create estate for pgoutput.c as I don't think we need what
is done in worker.c.
Thanks for the recent rebase.
- The v15 patch applies OK (albeit with a whitespace warning)
- make check is passing OK
- the new TAP test 020_row_filter is passing OK.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
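The NULL handling exercised by the b = NULL cases in the TAP test (a row filter that evaluates to NULL is treated as false, so the row is not published) follows SQL's three-valued logic. The sketch below is an illustrative model only, not the patch's implementation; None stands in for SQL NULL:

```python
# Map a three-valued SQL predicate result (True / False / None-as-NULL)
# to the publish/skip decision: only a definite True publishes the row.
def row_matches(filter_result):
    return filter_result is True

# Minimal model of SQL AND: False dominates, then NULL, then True.
def sql_and(x, y):
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True

# WHERE (a > 1000 AND b <> 'filtered') with b = NULL yields NULL in SQL:
assert row_matches(sql_and(True, None)) is False   # b IS NULL -> row skipped
assert row_matches(sql_and(True, True)) is True
```

This is why the UPDATE setting b = NULL and the DELETE whose old tuple has b = NULL are not replicated in the test.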
On Fri, Jun 18, 2021, at 8:40 AM, Amit Kapila wrote:
I have rebased the patch so that you can try it out. The main thing I
have done is to remove changes in worker.c and created a specialized
function to create estate for pgoutput.c as I don't think we need what
is done in worker.c.
Euler, do let me know if you are not happy with the change in pgoutput.c?
Amit, thanks for rebasing this patch. I already had a similar rebased patch in
my local tree. A recent commit broke your v15, so I rebased it.
I like the idea of a simple create_estate_for_relation() function (I fixed an
oversight regarding GetCurrentCommandId(false), which is appropriate because
the estate is used only for read-only purposes). This patch also replaces all
references to version 14 with version 15.
Commit ef948050 made some changes in the snapshot handling. Setting the
current active snapshot might not be required here, but future changes to
allow functions will need it.
As with the previous patches, this one includes commits (0002 and 0003) that
are not intended to be committed; they are provided for test-only purposes.
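The initial-copy rule for a table published by several publications with different WHERE clauses (rows must satisfy all expressions to be copied) can be sketched as follows. This is a hypothetical model with illustrative names, not the tablesync code itself:

```python
# A row is copied during initial synchronization only if it satisfies the
# row filter of every publication that publishes the table (filters are
# effectively ANDed together for the COPY query).
def copied_in_initial_sync(row, filters):
    return all(f(row) for f in filters)

# tap_pub_1 publishes tab_rowfilter_2 WHERE (c % 2 = 0) and tap_pub_2
# publishes it WHERE (c % 3 = 0): only multiples of 6 are copied.
filters = [lambda r: r["c"] % 2 == 0, lambda r: r["c"] % 3 == 0]
copied = [c for c in range(1, 21) if copied_in_initial_sync({"c": c}, filters)]
assert copied == [6, 12, 18]   # matches the TAP test's count/min/max of 3|6|18
```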
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
0001-Row-filter-for-logical-replication.patch (text/x-patch)
From 9c27d11efd2ac8257cc2664e5da82fd19f012ebc Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 12:07:51 -0300
Subject: [PATCH 1/3] Row filter for logical replication
This feature adds a row filter for publication tables. When you define or
modify a publication, you can optionally filter out rows that do not satisfy a
WHERE condition. This allows you to partially replicate a database or a set of
tables.
The row filter is per table, which means that you can define different row
filters for different tables. A new row filter can be added simply by
specifying a WHERE clause after the table name. The WHERE expression must be
enclosed in parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY; otherwise, DELETEs won't
be replicated. DELETE uses the old row version (which is limited to the
primary key or REPLICA IDENTITY columns) to evaluate the row filter. INSERT
and UPDATE use the new row version to evaluate the row filter, hence you can
use any column. If the row filter evaluates to NULL, it is treated as false.
For simplicity, functions are not allowed; this could be addressed in a future
patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If the subscriber is a pre-15 version, data
synchronization won't use row filters even if they are defined on the
publisher, because previous versions cannot handle row filters.
If your publication contains a partitioned table, the parameter
publish_via_partition_root determines whether the partition's row filter (if
the parameter is false, the default) or the partitioned table's row filter is
used.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 32 ++-
doc/src/sgml/ref/create_subscription.sgml | 11 +-
src/backend/catalog/pg_publication.c | 50 +++-
src/backend/commands/publicationcmds.c | 125 ++++++----
src/backend/parser/gram.y | 24 +-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 13 ++
src/backend/parser/parse_func.c | 3 +
src/backend/replication/logical/tablesync.c | 94 +++++++-
src/backend/replication/pgoutput/pgoutput.c | 241 ++++++++++++++++++--
src/bin/pg_dump/pg_dump.c | 24 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 34 +++
src/test/regress/sql/publication.sql | 23 ++
src/test/subscription/t/020_row_filter.pl | 221 ++++++++++++++++++
23 files changed, 872 insertions(+), 96 deletions(-)
create mode 100644 src/test/subscription/t/020_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index f517a7d4af..dbf2f46c00 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6233,6 +6233,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b2c6..ca091aae33 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is executed with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbca55..5c2b7d0bd2 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -131,9 +135,9 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
on its partitions) contained in the publication will be published
using the identity and schema of the partitioned table rather than
that of the individual partitions that are actually changed; the
- latter is the default. Enabling this allows the changes to be
- replicated into a non-partitioned table or a partitioned table
- consisting of a different set of partitions.
+ latter is the default (<literal>false</literal>). Enabling this
+ allows the changes to be replicated into a non-partitioned table or a
+ partitioned table consisting of a different set of partitions.
</para>
<para>
@@ -182,6 +186,14 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
disallowed on those tables.
</para>
+ <para>
+ The <literal>WHERE</literal> clause should probably contain only columns
+ that are part of the primary key or be covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will not
+ be replicated. For <command>INSERT</command> and <command>UPDATE</command>
+ operations, any column can be used in the <literal>WHERE</literal> clause.
+ </para>
+
<para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
@@ -197,6 +209,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -209,6 +226,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</programlisting>
</para>
+ <para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
<para>
Create a publication that publishes all changes in all tables:
<programlisting>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index e812beee37..7183700ed9 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,16 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If any table in the publications has a
+ <literal>WHERE</literal> clause, data synchronization does not use it
+ if the subscriber is a <productname>PostgreSQL</productname> version
+ before 15.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 86e415af89..78f5780fb7 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,18 +144,20 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationInfo *targetrel,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -161,7 +166,7 @@ publication_add_relation(Oid pubid, Relation targetrel,
* duplicates, it's here just to provide nicer error message in common
* case. The real protection is the unique key on the catalog.
*/
- if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid),
+ if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(targetrel->relid),
ObjectIdGetDatum(pubid)))
{
table_close(rel, RowExclusiveLock);
@@ -172,10 +177,27 @@ publication_add_relation(Oid pubid, Relation targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel), pub->name)));
+ RelationGetRelationName(targetrel->relation), pub->name)));
}
- check_publication_add_relation(targetrel);
+ check_publication_add_relation(targetrel->relation);
+
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(targetrel->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(targetrel->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -187,7 +209,13 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prpubid - 1] =
ObjectIdGetDatum(pubid);
values[Anum_pg_publication_rel_prrelid - 1] =
- ObjectIdGetDatum(relid);
+ ObjectIdGetDatum(targetrel->relid);
+
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
@@ -202,14 +230,20 @@ publication_add_relation(Oid pubid, Relation targetrel,
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
/* Add dependency on the relation */
- ObjectAddressSet(referenced, RelationRelationId, relid);
+ ObjectAddressSet(referenced, RelationRelationId, targetrel->relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel);
+ CacheInvalidateRelcache(targetrel->relation);
return myself;
}
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 95c253c8e0..9637d3ddba 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -48,8 +48,8 @@
/* Same as MAXNUMMESSAGES in sinvaladt.c */
#define MAX_RELCACHE_INVAL_MSGS 4096
-static List *OpenTableList(List *tables);
-static void CloseTableList(List *rels);
+static List *OpenTableList(List *tables, bool is_drop);
+static void CloseTableList(List *rels, bool is_drop);
static void PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
AlterPublicationStmt *stmt);
static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
@@ -232,9 +232,9 @@ CreatePublication(CreatePublicationStmt *stmt)
Assert(list_length(stmt->tables) > 0);
- rels = OpenTableList(stmt->tables);
+ rels = OpenTableList(stmt->tables, false);
PublicationAddTables(puboid, rels, true, NULL);
- CloseTableList(rels);
+ CloseTableList(rels, false);
}
table_close(rel, RowExclusiveLock);
@@ -372,7 +372,10 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
- rels = OpenTableList(stmt->tables);
+ if (stmt->tableAction == DEFELEM_DROP)
+ rels = OpenTableList(stmt->tables, true);
+ else
+ rels = OpenTableList(stmt->tables, false);
if (stmt->tableAction == DEFELEM_ADD)
PublicationAddTables(pubid, rels, false, stmt);
@@ -385,31 +388,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
+ PublicationRelationInfo *oldrel;
- foreach(newlc, rels)
- {
- Relation newrel = (Relation) lfirst(newlc);
-
- if (RelationGetRelid(newrel) == oldrelid)
- {
- found = true;
- break;
- }
- }
-
- if (!found)
- {
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
-
- delrels = lappend(delrels, oldrel);
- }
+ oldrel = palloc(sizeof(PublicationRelationInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -421,10 +417,10 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
*/
PublicationAddTables(pubid, rels, true, stmt);
- CloseTableList(delrels);
+ CloseTableList(delrels, false);
}
- CloseTableList(rels);
+ CloseTableList(rels, false);
}
/*
@@ -500,26 +496,42 @@ RemovePublicationRelById(Oid proid)
/*
* Open relations specified by a RangeVar list.
+ * AlterPublicationStmt->tables has a different list element, hence, is_drop
+ * indicates if it has a RangeVar (true) or PublicationTable (false).
* The returned tables are locked in ShareUpdateExclusiveLock mode in order to
* add them to a publication.
*/
static List *
-OpenTableList(List *tables)
+OpenTableList(List *tables, bool is_drop)
{
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationInfo *pri;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
- bool recurse = rv->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
+ if (is_drop)
+ {
+ rv = castNode(RangeVar, lfirst(lc));
+ }
+ else
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+
+ recurse = rv->inh;
+
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
@@ -538,8 +550,12 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = myrelid;
+ pri->relation = rel;
+ if (!is_drop)
+ pri->whereClause = t->whereClause;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, myrelid);
/*
@@ -572,7 +588,13 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = childrelid;
+ pri->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (!is_drop)
+ pri->whereClause = t->whereClause;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, childrelid);
}
}
@@ -587,16 +609,28 @@ OpenTableList(List *tables)
* Close all relations in the list.
*/
static void
-CloseTableList(List *rels)
+CloseTableList(List *rels, bool is_drop)
{
ListCell *lc;
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ if (is_drop)
+ {
+ Relation rel = (Relation) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(rel, NoLock);
+ }
+ else
+ {
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
+
+ table_close(pri->relation, NoLock);
+ }
}
+
+ if (!is_drop)
+ list_free_deep(rels);
}
/*
@@ -612,15 +646,15 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(pri->relid, GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(pri->relation->rd_rel->relkind),
+ RelationGetRelationName(pri->relation));
- obj = publication_add_relation(pubid, rel, if_not_exists);
+ obj = publication_add_relation(pubid, pri, if_not_exists);
if (stmt)
{
EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
@@ -644,11 +678,10 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pri->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -658,7 +691,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pri->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
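As a side note for reviewers: the RangeVar-vs-PublicationTable dispatch that OpenTableList() now does can be summarized by this standalone C sketch (the stub types and element_relname() are illustrative only, not the patch's actual structs):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for RangeVar and PublicationTable. */
typedef struct RangeVarStub
{
	const char *relname;
} RangeVarStub;

typedef struct PublicationTableStub
{
	RangeVarStub *relation;
	const char *whereClause;	/* NULL if no WHERE clause was given */
} PublicationTableStub;

/*
 * Mirror of the OpenTableList() dispatch: DROP TABLE passes plain
 * RangeVars, while ADD/SET TABLE passes PublicationTable wrappers that
 * carry an optional row filter.
 */
static const char *
element_relname(void *elem, bool is_drop, const char **where)
{
	if (is_drop)
	{
		/* DROP TABLE ... WHERE is rejected by the grammar */
		*where = NULL;
		return ((RangeVarStub *) elem)->relname;
	}
	else
	{
		PublicationTableStub *t = (PublicationTableStub *) elem;

		*where = t->whereClause;
		return t->relation->relname;
	}
}
```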
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index eb24195438..d82ea003db 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -426,14 +426,14 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
vacuum_relation_list opt_vacuum_relation_list
- drop_option_list
+ drop_option_list publication_table_list
%type <node> opt_routine_body
%type <groupclause> group_clause
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -9612,7 +9612,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9643,7 +9643,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9651,7 +9651,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9669,6 +9669,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index a25f8d5b98..5619ac6904 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -950,6 +957,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f928c32311..fc4170e723 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -509,6 +516,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1769,6 +1777,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3089,6 +3100,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de7da..e946f17c64 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 682c107e74..980826a502 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -691,19 +691,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -799,6 +803,55 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /* Get relation qual */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -812,6 +865,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -820,7 +874,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -829,16 +883,23 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* List of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with row filters, build a SELECT query with the AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -847,8 +908,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
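For illustration, the query assembly in copy_table() reduces to the following standalone C sketch. It uses SELECT * for brevity, whereas the patch builds an explicit column list to skip generated columns; build_copy_query() is a hypothetical helper, not code from the patch:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Assemble the initial-sync COPY query when row filters are present:
 * the filters collected from all subscribed publications are AND'ed,
 * since a row must satisfy every one of them to be copied.
 */
static void
build_copy_query(char *dst, size_t dstlen, const char *qualified_name,
				 const char **quals, int nquals)
{
	size_t		off = 0;

	off += snprintf(dst + off, dstlen - off, "COPY (SELECT * FROM %s",
					qualified_name);
	for (int i = 0; i < nquals; i++)
		off += snprintf(dst + off, dstlen - off, "%s%s",
						i == 0 ? " WHERE " : " AND ", quals[i]);
	snprintf(dst + off, dstlen - off, ") TO STDOUT");
}
```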
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index abd5217ab1..10f85365fc 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,26 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -61,6 +69,11 @@ static void pgoutput_stream_abort(struct LogicalDecodingContext *ctx,
static void pgoutput_stream_commit(struct LogicalDecodingContext *ctx,
ReorderBufferTXN *txn,
XLogRecPtr commit_lsn);
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_prepare_expr(Node *rfnode, EState *estate);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, List *rowfilter);
static bool publications_valid;
static bool in_streaming;
@@ -99,6 +112,7 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *qual;
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -122,7 +136,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation rel);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -520,6 +534,148 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
OutputPluginWrite(ctx, false);
}
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+static ExprState *
+pgoutput_row_filter_prepare_expr(Node *rfnode, EState *estate)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ exprstate = ExecPrepareExpr(expr, estate);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false, i.e., the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change should be replicated, otherwise false.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, List *rowfilter)
+{
+ TupleDesc tupdesc;
+ EState *estate;
+ ExprContext *ecxt;
+ MemoryContext oldcxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (rowfilter == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ tupdesc = RelationGetDescr(relation);
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldcxt);
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the table has a
+ * different row filter in each of them, all row filters must match for
+ * the change to be replicated.
+ */
+ foreach(lc, rowfilter)
+ {
+ Node *rfnode = (Node *) lfirst(lc);
+ ExprState *exprstate;
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_prepare_expr(rfnode, estate);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ elog(DEBUG3, "row filter \"%s\" %smatched",
+ TextDatumGetCString(DirectFunctionCall2(pg_get_expr,
+ CStringGetTextDatum(nodeToString(rfnode)),
+ ObjectIdGetDatum(relation->rd_id))),
+ result ? "" : "not ");
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
/*
* Sends the decoded DML over wire.
*
@@ -547,7 +703,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -571,8 +727,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, txn, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -580,6 +734,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry->qual))
+ return;
+
+ /*
+ * The schema must be sent before the relation is possibly switched
+ * below, because maybe_send_schema() also sends the ancestor's schema.
+ */
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -603,6 +767,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry->qual))
+ return;
+
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -631,6 +801,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry->qual))
+ return;
+
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -689,12 +865,11 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
for (i = 0; i < nrelations; i++)
{
Relation relation = relations[i];
- Oid relid = RelationGetRelid(relation);
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -704,10 +879,10 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
* root tables through it.
*/
if (relation->rd_rel->relispartition &&
- relentry->publish_as_relid != relid)
+ relentry->publish_as_relid != relentry->relid)
continue;
- relids[nrelids++] = relid;
+ relids[nrelids++] = relentry->relid;
maybe_send_schema(ctx, txn, change, relation, relentry);
}
@@ -1005,16 +1180,21 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation rel)
{
RelationSyncEntry *entry;
- bool am_partition = get_rel_relispartition(relid);
- char relkind = get_rel_relkind(relid);
+ Oid relid;
+ bool am_partition;
+ char relkind;
bool found;
MemoryContext oldctx;
Assert(RelationSyncCache != NULL);
+ relid = RelationGetRelid(rel);
+ am_partition = get_rel_relispartition(relid);
+ relkind = get_rel_relkind(relid);
+
/* Find cached relation info, creating if not found */
entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
(void *) &relid,
@@ -1030,6 +1210,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->qual = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1063,6 +1244,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1122,9 +1306,29 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ entry->qual = lappend(entry->qual, rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1242,6 +1446,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1251,6 +1456,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1268,5 +1475,11 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->qual != NIL)
+ list_free_deep(entry->qual);
+ entry->qual = NIL;
}
+
+ MemoryContextSwitchTo(oldctx);
}
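The evaluation semantics above (a NULL result counts as no match, and every publication's filter must pass, stopping at the first failure) can be sketched standalone in C; FilterResult and row_passes() are illustrative names, not part of the patch:

```c
#include <assert.h>
#include <stdbool.h>

/* Tri-state result of evaluating one filter; SQL booleans are nullable. */
typedef struct FilterResult
{
	bool		isnull;
	bool		value;
} FilterResult;

/*
 * Combine per-publication filter results the way pgoutput_row_filter()
 * does: a NULL result counts as false, and every filter must pass;
 * evaluation short-circuits at the first failure.
 */
static bool
row_passes(const FilterResult *results, int n)
{
	for (int i = 0; i < n; i++)
	{
		bool		pass = results[i].isnull ? false : results[i].value;

		if (!pass)
			return false;		/* change is not replicated */
	}
	return true;				/* no filter rejected the row */
}
```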
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 321152151d..6f944ec60d 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4172,6 +4172,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4182,9 +4183,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4193,6 +4201,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4233,6 +4242,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4265,8 +4278,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index ba9bc6ddd2..7d72d498c1 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -626,6 +626,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 2abf255798..e2e64cb3bf 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBuffer(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index f332bad4d4..333c2b581d 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -83,6 +83,13 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationInfo
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationInfo;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -108,7 +115,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationInfo *targetrel,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504cbb..154bb61777 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index d9e417bcd7..2037705f45 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -491,6 +491,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index def9651b34..cf815cc0f2 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3624,12 +3624,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3642,7 +3649,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 1500de2dd0..4537543a7b 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 63d6ab7a4e..96d869dd27 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -156,6 +156,40 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: "testpub_view" is not a table
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075368..35211c56f6 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,29 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+\dRp+ testpub5
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/020_row_filter.pl b/src/test/subscription/t/020_row_filter.pl
new file mode 100644
index 0000000000..35a41741d3
--- /dev/null
+++ b/src/test/subscription/t/020_row_filter.pl
@@ -0,0 +1,221 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+
+# test row filtering
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x"
+);
+# use partition row filter:
+# - replicate (1, 100) because 1 < 6000 is true
+# - don't replicate (8000, 101) because 8000 < 6000 is false
+# - replicate (15000, 102) because partition tab_rowfilter_greater_10k doesn't have row filter
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(8000, 101),(15000, 102)"
+);
+# insert directly into partition
+# use partition row filter: replicate (2, 200) because 2 < 6000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200)");
+# use partition row filter: replicate (5500, 300) because 5500 < 6000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(5500, 300)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table sync to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102), 'check filtered data was copied to subscriber');
+
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+# UPDATE is not replicated ; row filter evaluates to false when b = NULL
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+# DELETE is not replicated ; b is not part of the PK or replica identity and
+# old tuple contains b = NULL, hence, row filter evaluates to false
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check filtered data was copied to subscriber');
+
+# publish using partitioned table
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+# use partitioned table row filter: replicate, 4000 < 5000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400)");
+# use partitioned table row filter: replicate, 4500 < 5000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+# use partitioned table row filter: don't replicate, 5600 < 5000 is false
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+# use partitioned table row filter: don't replicate, 16000 < 5000 is false
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 1950)");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
2.20.1
Attachment: 0002-Measure-row-filter-overhead.patch (text/x-patch)
From 0ff56520b40129c1fc1fbb1379214a1b94413a9f Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Sun, 31 Jan 2021 20:48:43 -0300
Subject: [PATCH 2/3] Measure row filter overhead
---
src/backend/replication/pgoutput/pgoutput.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 10f85365fc..4677d21b62 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -618,6 +618,8 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
MemoryContext oldcxt;
ListCell *lc;
bool result = true;
+ instr_time start_time;
+ instr_time end_time;
/* Bail out if there is no row filter */
if (rowfilter == NIL)
@@ -627,6 +629,8 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
get_rel_name(relation->rd_id));
+ INSTR_TIME_SET_CURRENT(start_time);
+
tupdesc = RelationGetDescr(relation);
PushActiveSnapshot(GetTransactionSnapshot());
@@ -673,6 +677,11 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
FreeExecutorState(estate);
PopActiveSnapshot();
+ INSTR_TIME_SET_CURRENT(end_time);
+ INSTR_TIME_SUBTRACT(end_time, start_time);
+
+ elog(DEBUG2, "row filter time: %0.3f us", INSTR_TIME_GET_DOUBLE(end_time) * 1e6);
+
return result;
}
--
2.20.1
Attachment: 0003-Debug-messages.patch (text/x-patch)
From 4275d48c9f59b6f3b2bc99b4563119da3909dd56 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Fri, 26 Feb 2021 21:08:10 -0300
Subject: [PATCH 3/3] Debug messages
---
src/backend/parser/parse_expr.c | 6 ++++++
src/test/subscription/t/020_row_filter.pl | 1 +
2 files changed, 7 insertions(+)
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index fc4170e723..bd02576c86 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -116,6 +116,12 @@ transformExprRecurse(ParseState *pstate, Node *expr)
if (expr == NULL)
return NULL;
+ /*
+ * T_FuncCall: 349
+ * EXPR_KIND_PUBLICATION_WHERE: 42
+ */
+ elog(DEBUG3, "nodeTag(expr): %d ; pstate->p_expr_kind: %d ; nodeToString(expr): %s", nodeTag(expr), pstate->p_expr_kind, nodeToString(expr));
+
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
diff --git a/src/test/subscription/t/020_row_filter.pl b/src/test/subscription/t/020_row_filter.pl
index 35a41741d3..8c305cf1dc 100644
--- a/src/test/subscription/t/020_row_filter.pl
+++ b/src/test/subscription/t/020_row_filter.pl
@@ -8,6 +8,7 @@ use Test::More tests => 7;
# create publisher node
my $node_publisher = get_new_node('publisher');
$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf', 'log_min_messages = DEBUG3');
$node_publisher->start;
# create subscriber node
--
2.20.1
Hi.
I have been looking at the latest patch set (v16). Below are my review
comments and some patches.
The patches are:
v16-0001. This is identical to your previously posted 0001 patch. (I
only attached it again hoping it can allow the cfbot to keep working
OK).
v16-0002,0003. These are for demonstrating some of the review comments
v16-0004. This is a POC plan cache for your consideration.
//////////
REVIEW COMMENTS
===============
1. Patch 0001 comment - typo
you can optionally filter rows that does not satisfy a WHERE condition
typo: "does" should be "do"
~~
2. Patch 0001 comment - typo
The WHERE clause should probably contain only columns that are part of
the primary key or that are covered by REPLICA IDENTITY. Otherwise,
and DELETEs won't be replicated.
typo: "Otherwise, and DELETEs" ??
~~
3. Patch 0001 comment - typo and clarification
If your publication contains partitioned table, the parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false -- default) or the partitioned table row filter.
Typo: "contains partitioned table" -> "contains a partitioned table"
Also, perhaps the text "or the partitioned table row filter." should
say "or the root partitioned table row filter." to disambiguate the
case where there are more levels of partitions like A->B->C. e.g. What
filter does C use?
~~
4. src/backend/catalog/pg_publication.c - misleading names
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationInfo *targetrel,
bool if_not_exists)
Leaving this parameter name as "targetrel" seems a bit misleading now
in the function code. Maybe this should be called something like "pri"
which is consistent with other places where you have declared
PublicationRelationInfo.
Also, consider declaring some local variables so that the patch may
have less impact on existing code. e.g.
Oid relid = pri->relid;
Relation targetrel = pri->relation;
~~
5. src/backend/commands/publicationcmds.c - simplify code
- rels = OpenTableList(stmt->tables);
+ if (stmt->tableAction == DEFELEM_DROP)
+ rels = OpenTableList(stmt->tables, true);
+ else
+ rels = OpenTableList(stmt->tables, false);
Consider writing that code more simply as just:
rels = OpenTableList(stmt->tables, stmt->tableAction == DEFELEM_DROP);
~~
6. src/backend/commands/publicationcmds.c - bug?
- CloseTableList(rels);
+ CloseTableList(rels, false);
}
Is this a potential bug? When you called OpenTableList the 2nd param
was maybe true/false, so is it correct to be unconditionally false
here? I am not sure.
~~
7. src/backend/commands/publicationcmds.c - OpenTableList function comment.
* Open relations specified by a RangeVar list.
+ * AlterPublicationStmt->tables has a different list element, hence, is_drop
+ * indicates if it has a RangeVar (true) or PublicationTable (false).
* The returned tables are locked in ShareUpdateExclusiveLock mode in order to
* add them to a publication.
I am not sure about this. Should that comment instead say "indicates
if it has a Relation (true) or PublicationTable (false)"?
~~
8. src/backend/commands/publicationcmds.c - OpenTableList
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
- bool recurse = rv->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
+ if (is_drop)
+ {
+ rv = castNode(RangeVar, lfirst(lc));
+ }
+ else
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+
+ recurse = rv->inh;
+
For some reason it feels kind of clunky to me for this function to be
processing the list differently according to the 2nd param. e.g. the
name "is_drop" seems quite unrelated to the function code, and more to
do with where it was called from. Sorry, I don't have any better ideas
for improvement atm.
~~
9. src/backend/commands/publicationcmds.c - OpenTableList bug?
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = myrelid;
+ pri->relation = rel;
+ if (!is_drop)
+ pri->whereClause = t->whereClause;
+ rels = lappend(rels, pri);
I think this may be a bug: there is no code explicitly assigning
whereClause = NULL when "is_drop" is true, so it can hold a garbage
value that could cause problems later.
Maybe this is fixed by using palloc0.
The same pattern occurs twice in this function.
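A sketch of the palloc0-based fix (based only on the quoted hunk;
untested):

```
pri = palloc0(sizeof(PublicationRelationInfo));
pri->relid = myrelid;
pri->relation = rel;
if (!is_drop)
    pri->whereClause = t->whereClause;
/* with palloc0, pri->whereClause is reliably NULL in the is_drop case */
rels = lappend(rels, pri);
```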
~~
10. src/backend/commands/publicationcmds.c - CloseTableList function comment
@@ -587,16 +609,28 @@ OpenTableList(List *tables)
* Close all relations in the list.
*/
static void
-CloseTableList(List *rels)
+CloseTableList(List *rels, bool is_drop)
{
Probably the meaning of "is_drop" should be described in this function comment.
~~
11. src/backend/replication/pgoutput/pgoutput.c - get_rel_sync_entry signature.
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation rel);
I see that this function signature is modified but I did not see how
this parameter refactoring is actually related to the RowFilter patch.
Perhaps I am mistaken, but IIUC this only changes the relid =
RelationGetRelid(rel); to be done inside this function instead of
being done outside by the callers.
It impacts other code like in pgoutput_truncate:
@@ -689,12 +865,11 @@ pgoutput_truncate(LogicalDecodingContext *ctx,
ReorderBufferTXN *txn,
for (i = 0; i < nrelations; i++)
{
Relation relation = relations[i];
- Oid relid = RelationGetRelid(relation);
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -704,10 +879,10 @@ pgoutput_truncate(LogicalDecodingContext *ctx,
ReorderBufferTXN *txn,
* root tables through it.
*/
if (relation->rd_rel->relispartition &&
- relentry->publish_as_relid != relid)
+ relentry->publish_as_relid != relentry->relid)
continue;
- relids[nrelids++] = relid;
+ relids[nrelids++] = relentry->relid;
maybe_send_schema(ctx, txn, change, relation, relentry);
}
So maybe this is a good refactor or maybe not, but I felt this should
not be included as part of the RowFilter patch unless it is really
necessary.
~~
12. src/backend/replication/pgoutput/pgoutput.c - missing function comments
The static functions create_estate_for_relation and
pgoutput_row_filter_prepare_expr probably should be commented.
~~
13. src/backend/replication/pgoutput/pgoutput.c -
pgoutput_row_filter_prepare_expr function name
+static ExprState *pgoutput_row_filter_prepare_expr(Node *rfnode,
EState *estate);
This function has an unfortunate name with the word "prepare" in it. I
wonder if a different name can be found for this function to avoid any
confusion with pgoutput functions (coming soon) which are related to
the two-phase commit "prepare".
~~
14. src/bin/psql/describe.c
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
Because the where-clause value already has enclosing parentheses so
using " WHERE (%s)" seems overkill here. e.g. you can see the effect
in your src/test/regress/expected/publication.out file. I think this
should be changed to " WHERE %s" to give better output.
~~
15. src/include/catalog/pg_publication.h - new typedef
+typedef struct PublicationRelationInfo
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationInfo;
+
The new PublicationRelationInfo should also be added to
src/tools/pgindent/typedefs.list.
~~
16. src/include/nodes/parsenodes.h - new typedef
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
The new PublicationTable should also be added to src/tools/pgindent/typedefs.list.
~~
17. sql/publication.sql - show more output
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1,
testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000
AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another
WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300
AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+\dRp+ testpub5
I felt that it would be better to have a "\dRp+ testpub5" after each
of the valid ALTER PUBLICATION steps to show the intermediate results
also; not just the final one at the end.
(PSA a temp patch showing what I mean by this review comment)
~~
18. src/test/subscription/t/020_row_filter.pl - rename file
I think this file should be renamed to 021_row_filter.pl as there is
already an 020 TAP test present.
~~
19. src/test/subscription/t/020_row_filter.pl - test comments
AFAIK the test cases are all OK, but it was really quite hard to
review these TAP tests to try to determine what the expected results
should be.
I found that I had to add my own comments to the file so I could
understand what was going on, so I think the TAP test can benefit lots
from having many more comments describing how the expected results are
determined.
Also, the filtering does not take place at the INSERT but really it is
affected only by which publications the subscription has subscribed
to. So I thought some of the existing comments (although correct) are
misplaced.
(PSA a temp patch showing what I mean by this review comment)
~~
20. src/test/subscription/t/020_row_filter.pl - missing test case?
There are some partition tests, but I did not see any test that was
like 3 levels deep like A->B->C, so I was not sure if there is any
case C would ever make use of the filter of its parent B, or would it
only use the filter of the root A?
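For example (all names here are hypothetical, not taken from the patch),
a test along these lines would show whether the intermediate parent's
filter is ever consulted:

```sql
CREATE TABLE rf_a (x int PRIMARY KEY) PARTITION BY RANGE (x);
CREATE TABLE rf_b PARTITION OF rf_a FOR VALUES FROM (0) TO (100) PARTITION BY RANGE (x);
CREATE TABLE rf_c PARTITION OF rf_b FOR VALUES FROM (0) TO (50);
CREATE PUBLICATION pub_abc FOR TABLE rf_a WHERE (x < 30), rf_b WHERE (x < 20);
-- With publish_via_partition_root = false, which filter applies to an
-- INSERT routed to rf_c: rf_b's (its nearest published ancestor),
-- rf_a's, or none at all (because rf_c itself has no filter)?
INSERT INTO rf_a VALUES (5), (15), (25);
```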
~~
21. src/test/subscription/t/020_row_filter.pl - missing test case?
If the same table is in multiple publications they can each have a row
filter. And a subscription might subscribe to some but not all of
those publications. I think this scenario is only partly tested.
e.g.
pub_1 has tableX with RowFilter1
pub_2 has tableX with RowFilter2
Then sub_12 subscribes to pub_1, pub_2
This is already tested in your TAP test (I think) and it makes sure
both filters are applied.
But if there was also
pub_3 has tableX with RowFilter3
Then sub_12 should still be checking only RowFilter1 AND RowFilter2
(but NOT RowFilter3). I think this scenario is not tested.
////////////////
POC PATCH FOR PLAN CACHE
========================
PSA a POC patch for a plan cache which gets used inside the
pgoutput_row_filter function instead of calling prepare for every row.
I think this is implementing something like what Andres was suggesting a
while back [1].
Measurements with/without this plan cache:
Time spent processing within the pgoutput_row_filter function
- Data was captured using the same technique as the
0002-Measure-row-filter-overhead.patch.
- Inserted 1000 rows, sampled data for the first 100 times in this function.
not cached: average ~ 28.48 us
cached: average ~ 9.75 us
Replication times:
- Using the same tables and row filters as in Onder's commands_to_test_perf.sql [2]
100K rows - not cached: ~ 42sec, 43sec, 44sec
100K rows - cached: ~ 41sec, 42sec, 42sec.
There does seem to be a tiny gain achieved by having the plan cache,
but I think the gain might be a lot less than what people were
expecting.
Unless there are millions of rows, the speedup may be barely noticeable.
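The core idea of the POC, paraphrased as a sketch (this is my summary,
not the POC patch itself, and the field/variable names are illustrative),
is to keep the prepared ExprState in the per-relation cache entry
instead of rebuilding it for every row:

```
/* sketch: new member in RelationSyncEntry */
ExprState  *exprstate;      /* prepared row filter, built once */

/* sketch: in pgoutput_row_filter, replace per-row preparation with */
if (entry->exprstate == NULL)
    entry->exprstate = pgoutput_row_filter_prepare_expr(rfnode, estate);
ret = ExecEvalExprSwitchContext(entry->exprstate, ecxt, &isnull);
result = !isnull && DatumGetBool(ret);
```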
--------
[1]: /messages/by-id/20210128022032.eq2qqc6zxkqn5syt@alap3.anarazel.de
[2]: /messages/by-id/CACawEhW_iMnY9XK2tEb1ig+A+gKeB4cxdJcxMsoCU0SaKPExxg@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v16-0001-Row-filter-for-logical-replication.patch (application/octet-stream)
From e536f4f04c086dddabd5b005162f29d9ee0ea63b Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 1 Jul 2021 17:39:01 +1000
Subject: [PATCH v16] Row filter for logical replication
This feature adds row filter for publication tables. When you define or modify
a publication you can optionally filter rows that does not satisfy a WHERE
condition. It allows you to partially replicate a database or set of tables.
The row filter is per table which means that you can define different row
filters for different tables. A new row filter can be added simply by
informing the WHERE clause after the table name. The WHERE expression must be
enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, and DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If subscriber is a pre-15 version, data
synchronization won't use row filters if they are defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains partitioned table, the parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false -- default) or the partitioned table row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 32 +++-
doc/src/sgml/ref/create_subscription.sgml | 11 +-
src/backend/catalog/pg_publication.c | 50 +++++-
src/backend/commands/publicationcmds.c | 127 +++++++++------
src/backend/parser/gram.y | 24 ++-
src/backend/parser/parse_agg.c | 10 ++
src/backend/parser/parse_expr.c | 13 ++
src/backend/parser/parse_func.c | 3 +
src/backend/replication/logical/tablesync.c | 94 ++++++++++-
src/backend/replication/pgoutput/pgoutput.c | 241 ++++++++++++++++++++++++++--
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 34 ++++
src/test/regress/sql/publication.sql | 23 +++
src/test/subscription/t/020_row_filter.pl | 221 +++++++++++++++++++++++++
23 files changed, 873 insertions(+), 97 deletions(-)
create mode 100644 src/test/subscription/t/020_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index f517a7d..dbf2f46 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6233,6 +6233,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b..ca091aa 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is executed with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbc..5c2b7d0 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. The <literal>WHERE</literal> clause has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -131,9 +135,9 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
on its partitions) contained in the publication will be published
using the identity and schema of the partitioned table rather than
that of the individual partitions that are actually changed; the
- latter is the default. Enabling this allows the changes to be
- replicated into a non-partitioned table or a partitioned table
- consisting of a different set of partitions.
+ latter is the default (<literal>false</literal>). Enabling this
+ allows the changes to be replicated into a non-partitioned table or a
+ partitioned table consisting of a different set of partitions.
</para>
<para>
@@ -183,6 +187,14 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that
+ are part of the primary key or are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will
+ not be replicated. For <command>INSERT</command> and <command>UPDATE</command>
+ operations, any column can be used in the <literal>WHERE</literal> clause.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -197,6 +209,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -210,6 +227,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
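Putting the synopsis changes above together, the feature would be exercised roughly as follows (a sketch; the table and column names are illustrative, not from the patch):

```sql
-- New optional WHERE clause after the table name;
-- departments/active/id are example names.
CREATE PUBLICATION active_departments
    FOR TABLE departments WHERE (active IS TRUE);

-- ALTER ... ADD/SET TABLE accepts the same optional WHERE clause.
ALTER PUBLICATION active_departments
    SET TABLE departments WHERE (active IS TRUE AND id <= 3000);
```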
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index e812bee..7183700 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,16 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If any table in the publications has a
+ <literal>WHERE</literal> clause, the initial data synchronization does
+ not apply it when the subscriber is running a
+ <productname>PostgreSQL</productname> version before 15.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 86e415a..78f5780 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,18 +144,20 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationInfo *targetrel,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -161,7 +166,7 @@ publication_add_relation(Oid pubid, Relation targetrel,
* duplicates, it's here just to provide nicer error message in common
* case. The real protection is the unique key on the catalog.
*/
- if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid),
+ if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(targetrel->relid),
ObjectIdGetDatum(pubid)))
{
table_close(rel, RowExclusiveLock);
@@ -172,10 +177,27 @@ publication_add_relation(Oid pubid, Relation targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel), pub->name)));
+ RelationGetRelationName(targetrel->relation), pub->name)));
}
- check_publication_add_relation(targetrel);
+ check_publication_add_relation(targetrel->relation);
+
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(targetrel->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(targetrel->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -187,7 +209,13 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prpubid - 1] =
ObjectIdGetDatum(pubid);
values[Anum_pg_publication_rel_prrelid - 1] =
- ObjectIdGetDatum(relid);
+ ObjectIdGetDatum(targetrel->relid);
+
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
@@ -202,14 +230,20 @@ publication_add_relation(Oid pubid, Relation targetrel,
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
/* Add dependency on the relation */
- ObjectAddressSet(referenced, RelationRelationId, relid);
+ ObjectAddressSet(referenced, RelationRelationId, targetrel->relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel);
+ CacheInvalidateRelcache(targetrel->relation);
return myself;
}
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 95c253c..9637d3d 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -48,8 +48,8 @@
/* Same as MAXNUMMESSAGES in sinvaladt.c */
#define MAX_RELCACHE_INVAL_MSGS 4096
-static List *OpenTableList(List *tables);
-static void CloseTableList(List *rels);
+static List *OpenTableList(List *tables, bool is_drop);
+static void CloseTableList(List *rels, bool is_drop);
static void PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
AlterPublicationStmt *stmt);
static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
@@ -232,9 +232,9 @@ CreatePublication(CreatePublicationStmt *stmt)
Assert(list_length(stmt->tables) > 0);
- rels = OpenTableList(stmt->tables);
+ rels = OpenTableList(stmt->tables, false);
PublicationAddTables(puboid, rels, true, NULL);
- CloseTableList(rels);
+ CloseTableList(rels, false);
}
table_close(rel, RowExclusiveLock);
@@ -372,7 +372,10 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
- rels = OpenTableList(stmt->tables);
+ if (stmt->tableAction == DEFELEM_DROP)
+ rels = OpenTableList(stmt->tables, true);
+ else
+ rels = OpenTableList(stmt->tables, false);
if (stmt->tableAction == DEFELEM_ADD)
PublicationAddTables(pubid, rels, false, stmt);
@@ -385,31 +388,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could instead remove only
+ * (i) tables that are not found in the new table list and (ii) tables
+ * that are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- Relation newrel = (Relation) lfirst(newlc);
-
- if (RelationGetRelid(newrel) == oldrelid)
- {
- found = true;
- break;
- }
- }
-
- if (!found)
- {
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
-
- delrels = lappend(delrels, oldrel);
- }
+ PublicationRelationInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelationInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -421,10 +417,10 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
*/
PublicationAddTables(pubid, rels, true, stmt);
- CloseTableList(delrels);
+ CloseTableList(delrels, false);
}
- CloseTableList(rels);
+ CloseTableList(rels, false);
}
/*
@@ -500,26 +496,42 @@ RemovePublicationRelById(Oid proid)
/*
* Open relations specified by a RangeVar list.
+ * AlterPublicationStmt->tables contains different list elements: is_drop
+ * indicates whether each element is a RangeVar (true) or a
+ * PublicationTable (false).
* The returned tables are locked in ShareUpdateExclusiveLock mode in order to
* add them to a publication.
*/
static List *
-OpenTableList(List *tables)
+OpenTableList(List *tables, bool is_drop)
{
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationInfo *pri;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
- bool recurse = rv->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
+ if (is_drop)
+ {
+ rv = castNode(RangeVar, lfirst(lc));
+ }
+ else
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+
+ recurse = rv->inh;
+
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
@@ -538,8 +550,12 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = myrelid;
+ pri->relation = rel;
+ if (!is_drop)
+ pri->whereClause = t->whereClause;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, myrelid);
/*
@@ -572,7 +588,13 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = childrelid;
+ pri->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (!is_drop)
+ pri->whereClause = t->whereClause;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, childrelid);
}
}
@@ -587,16 +609,28 @@ OpenTableList(List *tables)
* Close all relations in the list.
*/
static void
-CloseTableList(List *rels)
+CloseTableList(List *rels, bool is_drop)
{
ListCell *lc;
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ if (is_drop)
+ {
+ Relation rel = (Relation) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(rel, NoLock);
+ }
+ else
+ {
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
+
+ table_close(pri->relation, NoLock);
+ }
}
+
+ if (!is_drop)
+ list_free_deep(rels);
}
/*
@@ -612,15 +646,15 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(pri->relid, GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(pri->relation->rd_rel->relkind),
+ RelationGetRelationName(pri->relation));
- obj = publication_add_relation(pubid, rel, if_not_exists);
+ obj = publication_add_relation(pubid, pri, if_not_exists);
if (stmt)
{
EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
@@ -644,11 +678,10 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pri->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -658,7 +691,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pri->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index eb24195..d82ea00 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -426,14 +426,14 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
vacuum_relation_list opt_vacuum_relation_list
- drop_option_list
+ drop_option_list publication_table_list
%type <node> opt_routine_body
%type <groupclause> group_clause
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -9612,7 +9612,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9643,7 +9643,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9651,7 +9651,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9669,6 +9669,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index a25f8d5..5619ac6 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -950,6 +957,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f928c32..fc4170e 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -509,6 +516,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1769,6 +1777,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3089,6 +3100,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..e946f17 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 682c107..980826a 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -691,19 +691,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -799,6 +803,55 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /* Get relation qual */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -812,6 +865,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -820,7 +874,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -829,16 +883,23 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* List of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -847,8 +908,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
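For review purposes, when row filters are present the SELECT-based COPY that copy_table() builds ends up with roughly this shape (schema, columns, and filter text here are hypothetical; filters coming from different publications are AND'ed together):

```sql
-- Approximate query sent to the publisher during initial
-- synchronization when the table has row filters (names are examples):
COPY (SELECT id, name
        FROM public.departments
       WHERE (id > 2000 AND id <= 3000) AND (active IS TRUE)) TO STDOUT;
```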
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index abd5217..10f8536 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,26 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -61,6 +69,11 @@ static void pgoutput_stream_abort(struct LogicalDecodingContext *ctx,
static void pgoutput_stream_commit(struct LogicalDecodingContext *ctx,
ReorderBufferTXN *txn,
XLogRecPtr commit_lsn);
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_prepare_expr(Node *rfnode, EState *estate);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, List *rowfilter);
static bool publications_valid;
static bool in_streaming;
@@ -99,6 +112,7 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *qual;
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -122,7 +136,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation rel);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -520,6 +534,148 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
OutputPluginWrite(ctx, false);
}
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+static ExprState *
+pgoutput_row_filter_prepare_expr(Node *rfnode, EState *estate)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ exprstate = ExecPrepareExpr(expr, estate);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * The change is checked against the row filter, if any.
+ *
+ * If this function returns true, the change is replicated; otherwise it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, List *rowfilter)
+{
+ TupleDesc tupdesc;
+ EState *estate;
+ ExprContext *ecxt;
+ MemoryContext oldcxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (rowfilter == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ tupdesc = RelationGetDescr(relation);
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldcxt);
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, rowfilter)
+ {
+ Node *rfnode = (Node *) lfirst(lc);
+ ExprState *exprstate;
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_prepare_expr(rfnode, estate);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ elog(DEBUG3, "row filter \"%s\" %smatched",
+ TextDatumGetCString(DirectFunctionCall2(pg_get_expr,
+ CStringGetTextDatum(nodeToString(rfnode)),
+ ObjectIdGetDatum(relation->rd_id))),
+ result ? "" : "not ");
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
/*
* Sends the decoded DML over wire.
*
@@ -547,7 +703,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -571,8 +727,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, txn, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -580,6 +734,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry->qual))
+ return;
+
+ /*
+ * Schema should be sent before the logic that replaces the
+ * relation because it also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -603,6 +767,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry->qual))
+ return;
+
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -631,6 +801,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry->qual))
+ return;
+
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -689,12 +865,11 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
for (i = 0; i < nrelations; i++)
{
Relation relation = relations[i];
- Oid relid = RelationGetRelid(relation);
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -704,10 +879,10 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
* root tables through it.
*/
if (relation->rd_rel->relispartition &&
- relentry->publish_as_relid != relid)
+ relentry->publish_as_relid != relentry->relid)
continue;
- relids[nrelids++] = relid;
+ relids[nrelids++] = relentry->relid;
maybe_send_schema(ctx, txn, change, relation, relentry);
}
@@ -1005,16 +1180,21 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation rel)
{
RelationSyncEntry *entry;
- bool am_partition = get_rel_relispartition(relid);
- char relkind = get_rel_relkind(relid);
+ Oid relid;
+ bool am_partition;
+ char relkind;
bool found;
MemoryContext oldctx;
Assert(RelationSyncCache != NULL);
+ relid = RelationGetRelid(rel);
+ am_partition = get_rel_relispartition(relid);
+ relkind = get_rel_relkind(relid);
+
/* Find cached relation info, creating if not found */
entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
(void *) &relid,
@@ -1030,6 +1210,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->qual = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1063,6 +1244,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1122,9 +1306,29 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ entry->qual = lappend(entry->qual, rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1242,6 +1446,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1251,6 +1456,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1268,5 +1475,11 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->qual != NIL)
+ list_free_deep(entry->qual);
+ entry->qual = NIL;
}
+
+ MemoryContextSwitchTo(oldctx);
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 3211521..6f944ec 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4172,6 +4172,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4182,9 +4183,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4193,6 +4201,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4233,6 +4242,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4265,8 +4278,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index ba9bc6d..7d72d49 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -626,6 +626,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 2abf255..e2e64cb 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBuffer(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index f332bad..333c2b5 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -83,6 +83,13 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationInfo
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationInfo;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -108,7 +115,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationInfo *targetrel,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index d9e417b..2037705 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -491,6 +491,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index def9651..cf815cc 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3624,12 +3624,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3642,7 +3649,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 1500de2..4537543 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 63d6ab7..96d869d 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -156,6 +156,40 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: "testpub_view" is not a table
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075..35211c5 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,29 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+\dRp+ testpub5
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/020_row_filter.pl b/src/test/subscription/t/020_row_filter.pl
new file mode 100644
index 0000000..35a4174
--- /dev/null
+++ b/src/test/subscription/t/020_row_filter.pl
@@ -0,0 +1,221 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+
+# test row filtering
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x"
+);
+# use partition row filter:
+# - replicate (1, 100) because 1 < 6000 is true
+# - don't replicate (8000, 101) because 8000 < 6000 is false
+# - replicate (15000, 102) because partition tab_rowfilter_greater_10k doesn't have a row filter
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(8000, 101),(15000, 102)"
+);
+# insert directly into partition
+# use partition row filter: replicate (2, 200) because 2 < 6000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200)");
+# use partition row filter: replicate (5500, 300) because 5500 < 6000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(5500, 300)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table sync to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102), 'check filtered data was copied to subscriber');
+
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+# UPDATE is not replicated; the row filter evaluates to false when b = NULL
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+# DELETE is not replicated; b is not part of the PK or replica identity and
+# the old tuple contains b = NULL, hence the row filter evaluates to false
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check filtered data was copied to subscriber');
+
+# publish using partitioned table
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+# use partitioned table row filter: replicate, 4000 < 5000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400)");
+# use partitioned table row filter: replicate, 4500 < 5000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+# use partitioned table row filter: don't replicate, 5600 < 5000 is false
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+# use partitioned table row filter: don't replicate, 16000 < 5000 is false
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 1950)");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
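The combination semantics implemented in pgoutput_row_filter above — when a subscription's publications attach several row filters to the same table, all of them must match, and a filter evaluating to NULL suppresses the change — can be sketched as a minimal Python model (hypothetical names, not the patch's actual code; Python's None stands in for SQL NULL):

```python
def row_filter_matches(filters, row):
    """Return True only if every row filter evaluates to true for the row.

    Mirrors pgoutput_row_filter: an empty filter list replicates everything,
    and a NULL (None) result from any filter counts as false.
    """
    for rfilter in filters:
        result = rfilter(row)      # True, False, or None (SQL NULL)
        if result is not True:     # NULL and false both suppress the change
            return False
    return True

# Two publications publish the same table with different WHERE clauses,
# e.g. WHERE (a > 1000 AND b <> 'filtered') and WHERE (a < 2000).
filters = [
    lambda r: None if r["b"] is None else (r["a"] > 1000 and r["b"] != "filtered"),
    lambda r: r["a"] < 2000,
]

print(row_filter_matches(filters, {"a": 1500, "b": "kept"}))  # True
print(row_filter_matches(filters, {"a": 1500, "b": None}))    # False: NULL filter result
print(row_filter_matches(filters, {"a": 2500, "b": "kept"}))  # False: second filter fails
```

This is why the 020_row_filter.pl test expects the UPDATE setting b = NULL and the DELETE of the b = NULL old tuple not to be replicated.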
v16-0002-PS-tmp-describe-intermediate-test-steps.patch (application/octet-stream)
From d76ef86ec785a5120867648eba7557ed7ab2db1b Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 2 Jul 2021 12:11:01 +1000
Subject: [PATCH v16] PS tmp - describe intermediate test steps
Added more calls to \dRp+ to also show the intermediate states of the row filters.
---
src/test/regress/expected/publication.out | 44 +++++++++++++++++++++++++------
src/test/regress/sql/publication.sql | 5 +++-
2 files changed, 40 insertions(+), 9 deletions(-)
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 96d869d..30b7576 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -163,10 +163,46 @@ CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
-- fail - functions disallowed
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
ERROR: functions are not allowed in publication WHERE expressions
@@ -177,14 +213,6 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: syntax error at or near "WHERE"
LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
^
-\dRp+ testpub5
- Publication testpub5
- Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
---------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
-Tables:
- "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
-
DROP TABLE testpub_rf_tbl1;
DROP TABLE testpub_rf_tbl2;
DROP TABLE testpub_rf_tbl3;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 35211c5..5a32e16 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -100,15 +100,18 @@ CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
RESET client_min_messages;
+\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
-- fail - functions disallowed
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-- fail - WHERE not allowed in DROP
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
-\dRp+ testpub5
DROP TABLE testpub_rf_tbl1;
DROP TABLE testpub_rf_tbl2;
--
1.8.3.1
v16-0004-PS-POC-Implement-a-plan-cache-for-pgoutput.patchapplication/octet-stream; name=v16-0004-PS-POC-Implement-a-plan-cache-for-pgoutput.patchDownload
From 27298020ac1b08319ec4b3aac4f56eebdb20d4b7 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 2 Jul 2021 16:54:45 +1000
Subject: [PATCH v16] PS POC - Implement a plan cache for pgoutput.
This is a POC patch to implement a plan cache that is used inside the pgoutput_row_filter function instead of calling prepare for every row.
This is intended to implement a cache like what Andres was suggesting [1] to see what difference it makes.
Use #if 0/1 to toggle without/with caching.
[1] https://www.postgresql.org/message-id/20210128022032.eq2qqc6zxkqn5syt%40alap3.anarazel.de
---
src/backend/replication/pgoutput/pgoutput.c | 90 ++++++++++++++++++++++++++---
1 file changed, 82 insertions(+), 8 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 10f8536..86aa012 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -35,6 +35,7 @@
#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
+#include "optimizer/optimizer.h"
PG_MODULE_MAGIC;
@@ -72,8 +73,6 @@ static void pgoutput_stream_commit(struct LogicalDecodingContext *ctx,
static EState *create_estate_for_relation(Relation rel);
static ExprState *pgoutput_row_filter_prepare_expr(Node *rfnode, EState *estate);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
- HeapTuple newtuple, List *rowfilter);
static bool publications_valid;
static bool in_streaming;
@@ -113,6 +112,7 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
List *qual;
+ List *exprstate_list;
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -144,6 +144,8 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
/*
* Specify output plugin callbacks
@@ -578,6 +580,35 @@ pgoutput_row_filter_prepare_expr(Node *rfnode, EState *estate)
return exprstate;
}
+static ExprState *
+pgoutput_row_filter_prepare_expr2(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /* Make the exprstate long-lived by using CacheMemoryContext. */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
/*
* Evaluates row filter.
*
@@ -610,7 +641,7 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
* If it returns true, the change is replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, List *rowfilter)
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
{
TupleDesc tupdesc;
EState *estate;
@@ -618,11 +649,20 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
MemoryContext oldcxt;
ListCell *lc;
bool result = true;
+//#define RF_TIMES
+#ifdef RF_TIMES
+ instr_time start_time;
+ instr_time end_time;
+#endif
/* Bail out if there is no row filter */
- if (rowfilter == NIL)
+ if (entry->qual == NIL)
return true;
+#ifdef RF_TIMES
+ INSTR_TIME_SET_CURRENT(start_time);
+#endif
+
elog(DEBUG3, "table \"%s.%s\" has row filter",
get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
get_rel_name(relation->rd_id));
@@ -646,7 +686,9 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
* different row filter in these publications, all row filters must be
* matched in order to replicate this change.
*/
- foreach(lc, rowfilter)
+#if 0
+ /* Don't use cached plan. */
+ foreach(lc, entry->qual)
{
Node *rfnode = (Node *) lfirst(lc);
ExprState *exprstate;
@@ -667,12 +709,34 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
if (!result)
break;
}
+#else
+ /* Use cached plan. */
+ foreach(lc, entry->exprstate_list)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ elog(DEBUG3, "row filter %smatched", result ? "" : " not");
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+#endif
/* Cleanup allocated resources */
ResetExprContext(ecxt);
FreeExecutorState(estate);
PopActiveSnapshot();
+#ifdef RF_TIMES
+ INSTR_TIME_SET_CURRENT(end_time);
+ INSTR_TIME_SUBTRACT(end_time, start_time);
+ elog(LOG, "row filter time: %0.3f us", INSTR_TIME_GET_DOUBLE(end_time) * 1e6);
+#endif
+
return result;
}
@@ -735,7 +799,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, NULL, tuple, relentry->qual))
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
return;
/*
@@ -768,7 +832,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry->qual))
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
return;
maybe_send_schema(ctx, txn, change, relation, relentry);
@@ -802,7 +866,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry->qual))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
return;
maybe_send_schema(ctx, txn, change, relation, relentry);
@@ -1211,6 +1275,7 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->qual = NIL;
+ entry->exprstate_list = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1320,10 +1385,16 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
if (!rfisnull)
{
Node *rfnode;
+ ExprState *exprstate;
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
rfnode = stringToNode(TextDatumGetCString(rfdatum));
entry->qual = lappend(entry->qual, rfnode);
+
+ /* Cache the planned row filter */
+ exprstate = pgoutput_row_filter_prepare_expr2(rfnode);
+ entry->exprstate_list = lappend(entry->exprstate_list, exprstate);
+
MemoryContextSwitchTo(oldctx);
}
@@ -1479,6 +1550,9 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (entry->qual != NIL)
list_free_deep(entry->qual);
entry->qual = NIL;
+
+ /* FIXME - something to be freed here? */
+ entry->exprstate_list = NIL;
}
MemoryContextSwitchTo(oldctx);
--
1.8.3.1
v16-0003-PS-tmp-add-more-comments-for-expected-results.patchapplication/octet-stream; name=v16-0003-PS-tmp-add-more-comments-for-expected-results.patchDownload
From aa73700ca5c44dcb753d062d08d0ffc753d3c173 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 2 Jul 2021 12:38:47 +1000
Subject: [PATCH v16] PS tmp - add more comments for expected results
No test code is changed, but this patch adds lots more comments about reasons for the expected results.
---
src/test/subscription/t/020_row_filter.pl | 89 ++++++++++++++++++++++++-------
1 file changed, 70 insertions(+), 19 deletions(-)
diff --git a/src/test/subscription/t/020_row_filter.pl b/src/test/subscription/t/020_row_filter.pl
index 35a4174..e018b0d 100644
--- a/src/test/subscription/t/020_row_filter.pl
+++ b/src/test/subscription/t/020_row_filter.pl
@@ -83,7 +83,12 @@ $node_publisher->safe_psql('postgres',
"ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
);
-# test row filtering
+# ----------------------------------------------------------
+# The following inserts come before the CREATE SUBSCRIPTION,
+# so these are for testing the initial table copy_data
+# replication.
+# ----------------------------------------------------------
+
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
$node_publisher->safe_psql('postgres',
@@ -96,20 +101,13 @@ $node_publisher->safe_psql('postgres',
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
$node_publisher->safe_psql('postgres',
- "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x"
-);
-# use partition row filter:
-# - replicate (1, 100) because 1 < 6000 is true
-# - don't replicate (8000, 101) because 8000 < 6000 is false
-# - replicate (15000, 102) because partition tab_rowfilter_greater_10k doesn't have row filter
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert into partitioned table and partitions
$node_publisher->safe_psql('postgres',
- "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(8000, 101),(15000, 102)"
-);
-# insert directly into partition
-# use partition row filter: replicate (2, 200) because 2 < 6000 is true
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(8000, 101),(15000, 102)");
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200)");
-# use partition row filter: replicate (5500, 300) because 5500 < 6000 is true
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(5500, 300)");
@@ -127,6 +125,12 @@ my $synced_query =
$node_subscriber->poll_query_until('postgres', $synced_query)
or die "Timed out while waiting for subscriber to synchronize data";
+# Check expected replicated rows for tab_rowfilter_1
+# pub1 filter is: (a > 1000 AND b <> 'filtered')
+# - (1, 'not replicated') - no, because a not > 1000
+# - (1500, 'filtered') - no, because b == 'filtered'
+# - (1980, 'not filtered') - YES
+# - SELECT x, 'test ' || x FROM generate_series(990,1002) x" - YES, only for 1001,1002 because a > 1000
my $result =
$node_subscriber->safe_psql('postgres',
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
@@ -134,16 +138,38 @@ is( $result, qq(1001|test 1001
1002|test 1002
1980|not filtered), 'check filtered data was copied to subscriber');
+# Check expected replicated rows for tab_rowfilter_2
+# pub1 filter is: (c % 2 = 0)
+# pub2 filter is: (c % 3 = 0)
+# So only 2, 4, 6, 8, 10, 12, 14, 16, 18, 20 should pass filter on pub1
+# So only 3, 6, 9, 12, 15, 18 should pass filter on pub2
+# So combined is 6, 12, 18, which is count 3, min 6, max 18
$result =
$node_subscriber->safe_psql('postgres',
"SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
is($result, qq(3|6|18), 'check filtered data was copied to subscriber');
+# Check expected replicated rows for tab_rowfilter_3
+# filter is null.
+# 10 rows are inserted, so 10 rows are replicated.
$result =
$node_subscriber->safe_psql('postgres',
"SELECT count(a) FROM tab_rowfilter_3");
is($result, qq(10), 'check filtered data was copied to subscriber');
+# Check expected replicated rows for partitions
+# PUBLICATION option "publish_via_partition_root" is default, so use the filter at table level
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: null
+# INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(8000, 101),(15000, 102)
+# - (1,100) YES, because 1 < 6000
+# - (8000, 101) NO, because fails 8000 < 6000
+# - (15000, 102) YES, because tab_rowfilter_greater_10k has null filter
+# INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200)
+# - (2, 200) YES, because 2 < 6000
+# INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(5500, 300)
+# - (5500, 300) YES, because 5500 < 6000 (Note: using the filter at the table, not the partition root)
$result =
$node_subscriber->safe_psql('postgres',
"SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
@@ -156,6 +182,11 @@ $result =
"SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
is($result, qq(15000|102), 'check filtered data was copied to subscriber');
+# ------------------------------------------------------------
+# The following operations come after the CREATE SUBSCRIPTION,
+# so these are for testing normal replication behaviour.
+# -----------------------------------------------------------
+
# test row filter (INSERT, UPDATE, DELETE)
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
@@ -165,18 +196,26 @@ $node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
-# UPDATE is not replicated ; row filter evaluates to false when b = NULL
$node_publisher->safe_psql('postgres',
"UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
$node_publisher->safe_psql('postgres',
"UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
-# DELETE is not replicated ; b is not part of the PK or replica identity and
-# old tuple contains b = NULL, hence, row filter evaluates to false
$node_publisher->safe_psql('postgres',
"DELETE FROM tab_rowfilter_1 WHERE a = 1700");
$node_publisher->wait_for_catchup($appname);
+# Check expected replicated rows for tab_rowfilter_1
+# pub1 filter is: (a > 1000 AND b <> 'filtered')
+# - 1001, 1002, 1980 already exist from previous inserts
+# - (800, 'test 800') NO because 800 < 1000
+# - (1600, 'test 1600') YES
+# - (1601, 'test 1601') YES
+# - (1700, 'test 1700') YES
+# UPDATE (1600, NULL) NO. row filter evaluates to false when b = NULL
+# UPDATE (1601, 'test 1601 updated') YES
+# DELETE (1700), NO. b is not part of the PK or replica identity and
+# old tuple contains b = NULL, hence, row filter evaluates to false
$result =
$node_subscriber->safe_psql('postgres',
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
@@ -194,21 +233,33 @@ $node_subscriber->safe_psql('postgres',
"TRUNCATE TABLE tab_rowfilter_partitioned");
$node_subscriber->safe_psql('postgres',
"ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
-# use partitioned table row filter: replicate, 4000 < 5000 is true
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400)");
-# use partitioned table row filter: replicate, 4500 < 5000 is true
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
-# use partitioned table row filter: don't replicate, 5600 < 5000 is false
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
-# use partitioned table row filter: don't replicate, 16000 < 5000 is false
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 1950)");
$node_publisher->wait_for_catchup($appname);
+# Check expected replicated rows for partitions
+# PUBLICATION option "publish_via_partition_root = true" is set, so use the filter at root level
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: null
+# Existing INSERTS (copied because of copy_data=true option)
+# - (1,100) YES, 1 < 5000
+# - (8000, 101) NO, fails 8000 < 5000
+# - (15000, 102) NO, fails 15000 < 5000
+# - (2, 200) YES, 2 < 6000
+# - (5500, 300) NO, fails 5500 < 5000
+# New INSERTS replicated (after the initial copy_data)?
+# - VALUES(4000, 400) YES, 4000 < 5000
+# - VALUES(4500, 450) YES, 4500 < 5000
+# - VALUES(5600, 123) NO fails 5600 < 5000
+# - VALUES(16000, 1950) NO fails 16000 < 5000
$result =
$node_subscriber->safe_psql('postgres',
"SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
--
1.8.3.1
On Thu, Jul 1, 2021 at 10:43 AM Euler Taveira <euler@eulerto.com> wrote:
Amit, thanks for rebasing this patch. I already had a similar rebased patch in
my local tree. A recent patch broke your version v15 so I rebased it.

I like the idea of a simple create_estate_for_relation() function (I fixed an
oversight regarding GetCurrentCommandId(false) because it is used only for
read-only purposes). This patch also replaces all references to version 14.

Commit ef948050 made some changes in the snapshot handling. Setting the current
active snapshot might not be required, but future changes to allow functions
will need it.

As with the previous patches, it includes commits (0002 and 0003) that are not
intended to be committed; they are available for test-only purposes.
I have some review comments on the "Row filter for logical replication" patch:
(1) Suggested update to patch comment:
(There are some missing words and things which could be better expressed)
This feature adds row filtering for publication tables.
When a publication is defined or modified, rows that don't satisfy a WHERE
clause may be optionally filtered out. This allows a database or set of
tables to be partially replicated. The row filter is per table, which allows
different row filters to be defined for different tables. A new row filter
can be added simply by specifying a WHERE clause after the table name.
The WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; that could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If the subscriber is a pre-15 version, data
synchronization won't use row filters if they are defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
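As an aside, the multi-publication semantics described above (every WHERE clause must match, and a NULL result counts as a non-match) can be sketched in a few lines of Python. This is purely illustrative pseudologic, not the patch's C code; the filters shown are hypothetical:

```python
def matches(predicate, row):
    """Evaluate one row filter; per SQL three-valued logic, NULL (None) is a non-match."""
    return predicate(row) is True

def row_is_replicated(filters, row):
    """A row is replicated only if it satisfies ALL publication row filters."""
    return all(matches(f, row) for f in filters)

# Hypothetical filters from two publications covering the same table:
pub1 = lambda r: r["a"] > 1000 and r["b"] != "filtered"
pub2 = lambda r: r["a"] < 2000
null_filter = lambda r: None   # a predicate that evaluates to SQL NULL

print(row_is_replicated([pub1, pub2], {"a": 1500, "b": "kept"}))      # True
print(row_is_replicated([pub1, pub2], {"a": 1500, "b": "filtered"}))  # False
print(row_is_replicated([null_filter], {"a": 1, "b": "x"}))           # False: NULL means not replicated
```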
(2) Some inconsistent error message wording:
Currently:
err = _("cannot use subquery in publication WHERE expression");
Suggest changing it to:
err = _("subqueries are not allowed in publication WHERE expressions");
Other examples from the patch:
err = _("aggregate functions are not allowed in publication WHERE expressions");
err = _("grouping operations are not allowed in publication WHERE expressions");
err = _("window functions are not allowed in publication WHERE expressions");
errmsg("functions are not allowed in publication WHERE expressions"),
err = _("set-returning functions are not allowed in publication WHERE expressions");
(3) The current code still allows arbitrary code execution, e.g. via a
user-defined operator:
e.g.
publisher:
CREATE OR REPLACE FUNCTION myop(left_arg INTEGER, right_arg INTEGER)
RETURNS BOOL AS
$$
BEGIN
RAISE NOTICE 'I can do anything here!';
RETURN left_arg > right_arg;
END;
$$ LANGUAGE PLPGSQL VOLATILE;
CREATE OPERATOR >>>> (
PROCEDURE = myop,
LEFTARG = INTEGER,
RIGHTARG = INTEGER
);
CREATE PUBLICATION tap_pub FOR TABLE test_tab WHERE (a >>>> 5);
subscriber:
CREATE SUBSCRIPTION tap_sub CONNECTION 'host=localhost dbname=test_pub
application_name=tap_sub' PUBLICATION tap_pub;
Perhaps add the following after the existing shell error-check in make_op():
/* User-defined operators are not allowed in publication WHERE clauses */
if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE &&
    opform->oid >= FirstNormalObjectId)
    ereport(ERROR,
            (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
             errmsg("user-defined operators are not allowed in publication WHERE expressions"),
             parser_errposition(pstate, location)));
Also, I believe it's allowing user-defined CASTs, so a similar check to the
one above could be added in transformTypeCast().
Ideally, it would be preferable to validate/check publication WHERE
expressions in one central place, rather than scattered all over the
place, but that might be easier said than done.
You need to update the patch comment accordingly.
(4) src/backend/replication/pgoutput/pgoutput.c
pgoutput_change()
The 3 added calls to pgoutput_row_filter() return from pgoutput_change()
when false is returned, but instead they should break from the switch;
otherwise cleanup code is skipped. This is surely a bug.
e.g.
(3 similar cases of this)
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry->qual))
+ return;
should be:
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry->qual))
+ break;
Regards,
Greg Nancarrow
Fujitsu Australia
On Thu, Jul 1, 2021 at 10:43 AM Euler Taveira <euler@eulerto.com> wrote:
Amit, thanks for rebasing this patch. I already had a similar rebased patch in
my local tree. A recent patch broke your version v15 so I rebased it.
Hi,
I did some testing of the performance of the row filtering, in the
case of the publisher INSERTing 100,000 rows, using a similar test
setup and timing as previously used in the "commands_to_perf_test.sql"
script posted by Önder Kalacı.
I found that with the call to ExecInitExtraTupleSlot() in
pgoutput_row_filter(), then the performance of pgoutput_row_filter()
degrades considerably over the 100,000 invocations, and on my system
it took about 43 seconds to filter and send to the subscriber.
However, by caching the tuple table slot in RelationSyncEntry, this
duration can be dramatically reduced by 38+ seconds.
A further improvement can be made using this in combination with
Peter's plan cache (v16-0004).
I've attached a patch for this, which relies on the latest v16-0001
and v16-0004 patches posted by Peter Smith (noting that v16-0001 is
identical to your previously-posted 0001 patch).
Also attached is a graph (created by Peter Smith – thanks!) detailing
the performance improvement.
Regards,
Greg Nancarrow
Fujitsu Australia
Attachments:
v16-0005-Improve-row-filtering-performance.patchapplication/octet-stream; name=v16-0005-Improve-row-filtering-performance.patchDownload
From e9b4114382ee7606d06a5ceae8f85053f0dca32f Mon Sep 17 00:00:00 2001
From: Greg Nancarrow <gregn4422@gmail.com>
Date: Wed, 7 Jul 2021 13:11:09 +1000
Subject: [PATCH v16] Substantially improve performance of
pgoutput_row_filter().
Repeated tuple table slot creation in pgoutput_row_filter() results in degraded
performance and large memory usage. This is greatly improved by caching the row
filtering tuple table slot in the relation sync cache.
---
src/backend/replication/pgoutput/pgoutput.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 86aa012505..d49fb37e1f 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -113,6 +113,8 @@ typedef struct RelationSyncEntry
PublicationActions pubactions;
List *qual;
List *exprstate_list;
+ TupleTableSlot *scantuple;
+ List *tuple_table;
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -643,10 +645,8 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
static bool
pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
{
- TupleDesc tupdesc;
EState *estate;
ExprContext *ecxt;
- MemoryContext oldcxt;
ListCell *lc;
bool result = true;
//#define RF_TIMES
@@ -667,17 +667,13 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
get_rel_name(relation->rd_id));
- tupdesc = RelationGetDescr(relation);
-
PushActiveSnapshot(GetTransactionSnapshot());
estate = create_estate_for_relation(relation);
/* Prepare context per tuple */
ecxt = GetPerTupleExprContext(estate);
- oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
- ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
- MemoryContextSwitchTo(oldcxt);
+ ecxt->ecxt_scantuple = entry->scantuple;
ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
@@ -1268,6 +1264,8 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
/* Not found means schema wasn't sent */
if (!found)
{
+ TupleDesc tupdesc;
+
/* immediately make a new entry valid enough to satisfy callbacks */
entry->schema_sent = false;
entry->streamed_txns = NIL;
@@ -1279,6 +1277,13 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
+
+ /* create a tuple table slot for use in row filtering */
+ entry->tuple_table = NIL;
+ tupdesc = RelationGetDescr(rel);
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ entry->scantuple = ExecAllocTableSlot(&entry->tuple_table, tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
}
/* Validate the entry */
--
2.27.0
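The slot-caching win reported above comes from turning per-row setup work into one-time setup. A toy Python sketch (hypothetical, not PostgreSQL code) that simply counts how often the "slot creation" step runs for N rows under each strategy:

```python
N = 100_000

# Uncached: the "slot" is created on every invocation, analogous to calling
# ExecInitExtraTupleSlot() inside pgoutput_row_filter() for each row.
uncached_preps = 0
def filter_row_uncached(row):
    global uncached_preps
    uncached_preps += 1          # per-row slot creation
    return row % 2 == 0

# Cached: the "slot" lives in a per-relation entry and is created once,
# analogous to storing the TupleTableSlot in RelationSyncEntry.
entry = {"slot": None, "preps": 0}
def filter_row_cached(row):
    if entry["slot"] is None:
        entry["preps"] += 1      # one-time slot creation
        entry["slot"] = object()
    return row % 2 == 0

for r in range(N):
    filter_row_uncached(r)
    filter_row_cached(r)

print(uncached_preps, entry["preps"])   # 100000 1
```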
On Wed, Jul 7, 2021, at 2:24 AM, Greg Nancarrow wrote:
I found that with the call to ExecInitExtraTupleSlot() in
pgoutput_row_filter(), then the performance of pgoutput_row_filter()
degrades considerably over the 100,000 invocations, and on my system
it took about 43 seconds to filter and send to the subscriber.
However, by caching the tuple table slot in RelationSyncEntry, this
duration can be dramatically reduced by 38+ seconds.
A further improvement can be made using this in combination with
Peter's plan cache (v16-0004).
I've attached a patch for this, which relies on the latest v16-0001
and v16-0004 patches posted by Peter Smith (noting that v16-0001 is
identical to your previously-posted 0001 patch).
Also attached is a graph (created by Peter Smith – thanks!) detailing
the performance improvement.
Greg, I like your suggestion and have already integrated it (I replaced
ExecAllocTableSlot() with MakeSingleTupleTableSlot() because we don't need the
List). I'm still working on a new version that integrates all the suggestions
you and Peter made. I have similar code to Peter's plan cache and I'm working
on merging both ideas together. I'm done for today but I'll continue tomorrow.
--
Euler Taveira
EDB https://www.enterprisedb.com/
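For what it's worth, the shape of the plan-cache idea being merged here (prepare each WHERE expression once per relation, then reuse the compiled form for every row) can be sketched in Python. The names below are hypothetical; the real patch caches ExprState nodes in RelationSyncEntry's exprstate_list:

```python
exprstate_cache = {}   # relation oid -> list of "prepared" predicates
prepare_count = 0

def prepare_filter(src):
    """Stand-in for parse/plan/ExecInitExpr: compile the expression once."""
    global prepare_count
    prepare_count += 1
    return eval("lambda r: " + src)   # predicate over a row dict r

def get_filters(reloid, sources):
    if reloid not in exprstate_cache:
        exprstate_cache[reloid] = [prepare_filter(s) for s in sources]
    return exprstate_cache[reloid]     # reused on every subsequent row

def row_filter(reloid, sources, row):
    return all(f(row) for f in get_filters(reloid, sources))

sources = ["r['c'] % 2 == 0", "r['c'] % 3 == 0"]
for c in range(1, 21):
    row_filter(16384, sources, {"c": c})

print(prepare_count)   # 2: each filter prepared once, not once per row
```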
On Thu, Jul 8, 2021 at 10:34 AM Euler Taveira <euler@eulerto.com> wrote:
Greg, I like your suggestion and have already integrated it (I replaced
ExecAllocTableSlot() with MakeSingleTupleTableSlot() because we don't need the
List).
Yes I agree, I found the same thing, it's not needed.
I'm still working on a new version that integrates all the suggestions you
and Peter made. I have similar code to Peter's plan cache and I'm working on
merging both ideas together. I'm done for today but I'll continue tomorrow.
I also realised that my 0005 patch wasn't handling RelationSyncEntry
invalidation, so I've updated it.
For completeness, I'm posting the complete patch set with the updates,
so you can look at it and compare with yours, and also it'll keep the
cfbot happy until you post your updated patch.
Regards,
Greg Nancarrow
Fujitsu Australia
Attachments:
v17-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v17-0001-Row-filter-for-logical-replication.patchDownload
From 6c15a8ebd5fb5b89ea775140e32a1aae741732b1 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 1 Jul 2021 17:39:01 +1000
Subject: [PATCH v17 1/5] Row filter for logical replication
This feature adds row filter for publication tables. When you define or modify
a publication you can optionally filter rows that does not satisfy a WHERE
condition. It allows you to partially replicate a database or set of tables.
The row filter is per table which means that you can define different row
filters for different tables. A new row filter can be added simply by
informing the WHERE clause after the table name. The WHERE expression must be
enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, and DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If subscriber is a pre-15 version, data
synchronization won't use row filters if they are defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains partitioned table, the parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false -- default) or the partitioned table row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 32 ++-
doc/src/sgml/ref/create_subscription.sgml | 11 +-
src/backend/catalog/pg_publication.c | 50 +++-
src/backend/commands/publicationcmds.c | 127 +++++++----
src/backend/parser/gram.y | 24 +-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 13 ++
src/backend/parser/parse_func.c | 3 +
src/backend/replication/logical/tablesync.c | 94 +++++++-
src/backend/replication/pgoutput/pgoutput.c | 241 ++++++++++++++++++--
src/bin/pg_dump/pg_dump.c | 24 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 34 +++
src/test/regress/sql/publication.sql | 23 ++
src/test/subscription/t/020_row_filter.pl | 221 ++++++++++++++++++
23 files changed, 873 insertions(+), 97 deletions(-)
create mode 100644 src/test/subscription/t/020_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index f517a7d4af..dbf2f46c00 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6233,6 +6233,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b2c6..ca091aae33 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is executed with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbca55..5c2b7d0bd2 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -131,9 +135,9 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
on its partitions) contained in the publication will be published
using the identity and schema of the partitioned table rather than
that of the individual partitions that are actually changed; the
- latter is the default. Enabling this allows the changes to be
- replicated into a non-partitioned table or a partitioned table
- consisting of a different set of partitions.
+ latter is the default (<literal>false</literal>). Enabling this
+ allows the changes to be replicated into a non-partitioned table or a
+ partitioned table consisting of a different set of partitions.
</para>
<para>
@@ -182,6 +186,14 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
disallowed on those tables.
</para>
+ <para>
+ The <literal>WHERE</literal> clause should probably contain only columns
+ that are part of the primary key or be covered by <literal>REPLICA
+ IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
+ be replicated. For <command>INSERT</command> and <command>UPDATE</command>
+ operations, any column can be used in the <literal>WHERE</literal> clause.
+ </para>
+
<para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
@@ -197,6 +209,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -209,6 +226,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</programlisting>
</para>
+ <para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
<para>
Create a publication that publishes all changes in all tables:
<programlisting>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index e812beee37..7183700ed9 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,16 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If any table in the publications has a
+ <literal>WHERE</literal> clause, data synchronization does not use it
+ if the subscriber is a <productname>PostgreSQL</productname> version
+ before 15.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 86e415af89..78f5780fb7 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,18 +144,20 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationInfo *targetrel,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -161,7 +166,7 @@ publication_add_relation(Oid pubid, Relation targetrel,
* duplicates, it's here just to provide nicer error message in common
* case. The real protection is the unique key on the catalog.
*/
- if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid),
+ if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(targetrel->relid),
ObjectIdGetDatum(pubid)))
{
table_close(rel, RowExclusiveLock);
@@ -172,10 +177,27 @@ publication_add_relation(Oid pubid, Relation targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel), pub->name)));
+ RelationGetRelationName(targetrel->relation), pub->name)));
}
- check_publication_add_relation(targetrel);
+ check_publication_add_relation(targetrel->relation);
+
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(targetrel->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(targetrel->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -187,7 +209,13 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prpubid - 1] =
ObjectIdGetDatum(pubid);
values[Anum_pg_publication_rel_prrelid - 1] =
- ObjectIdGetDatum(relid);
+ ObjectIdGetDatum(targetrel->relid);
+
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
@@ -202,14 +230,20 @@ publication_add_relation(Oid pubid, Relation targetrel,
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
/* Add dependency on the relation */
- ObjectAddressSet(referenced, RelationRelationId, relid);
+ ObjectAddressSet(referenced, RelationRelationId, targetrel->relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel);
+ CacheInvalidateRelcache(targetrel->relation);
return myself;
}
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 95c253c8e0..9637d3ddba 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -48,8 +48,8 @@
/* Same as MAXNUMMESSAGES in sinvaladt.c */
#define MAX_RELCACHE_INVAL_MSGS 4096
-static List *OpenTableList(List *tables);
-static void CloseTableList(List *rels);
+static List *OpenTableList(List *tables, bool is_drop);
+static void CloseTableList(List *rels, bool is_drop);
static void PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
AlterPublicationStmt *stmt);
static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
@@ -232,9 +232,9 @@ CreatePublication(CreatePublicationStmt *stmt)
Assert(list_length(stmt->tables) > 0);
- rels = OpenTableList(stmt->tables);
+ rels = OpenTableList(stmt->tables, false);
PublicationAddTables(puboid, rels, true, NULL);
- CloseTableList(rels);
+ CloseTableList(rels, false);
}
table_close(rel, RowExclusiveLock);
@@ -372,7 +372,10 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
- rels = OpenTableList(stmt->tables);
+ if (stmt->tableAction == DEFELEM_DROP)
+ rels = OpenTableList(stmt->tables, true);
+ else
+ rels = OpenTableList(stmt->tables, false);
if (stmt->tableAction == DEFELEM_ADD)
PublicationAddTables(pubid, rels, false, stmt);
@@ -385,31 +388,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- Relation newrel = (Relation) lfirst(newlc);
-
- if (RelationGetRelid(newrel) == oldrelid)
- {
- found = true;
- break;
- }
- }
-
- if (!found)
- {
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
-
- delrels = lappend(delrels, oldrel);
- }
+ PublicationRelationInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelationInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -421,10 +417,10 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
*/
PublicationAddTables(pubid, rels, true, stmt);
- CloseTableList(delrels);
+ CloseTableList(delrels, false);
}
- CloseTableList(rels);
+ CloseTableList(rels, false);
}
/*
@@ -500,26 +496,42 @@ RemovePublicationRelById(Oid proid)
/*
* Open relations specified by a RangeVar list.
+ * AlterPublicationStmt->tables has a different list element, hence, is_drop
+ * indicates if it has a RangeVar (true) or PublicationTable (false).
* The returned tables are locked in ShareUpdateExclusiveLock mode in order to
* add them to a publication.
*/
static List *
-OpenTableList(List *tables)
+OpenTableList(List *tables, bool is_drop)
{
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationInfo *pri;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
- bool recurse = rv->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
+ if (is_drop)
+ {
+ rv = castNode(RangeVar, lfirst(lc));
+ }
+ else
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+
+ recurse = rv->inh;
+
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
@@ -538,8 +550,12 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = myrelid;
+ pri->relation = rel;
+ if (!is_drop)
+ pri->whereClause = t->whereClause;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, myrelid);
/*
@@ -572,7 +588,13 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = childrelid;
+ pri->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (!is_drop)
+ pri->whereClause = t->whereClause;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, childrelid);
}
}
@@ -587,16 +609,28 @@ OpenTableList(List *tables)
* Close all relations in the list.
*/
static void
-CloseTableList(List *rels)
+CloseTableList(List *rels, bool is_drop)
{
ListCell *lc;
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ if (is_drop)
+ {
+ Relation rel = (Relation) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(rel, NoLock);
+ }
+ else
+ {
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
+
+ table_close(pri->relation, NoLock);
+ }
}
+
+ if (!is_drop)
+ list_free_deep(rels);
}
/*
@@ -612,15 +646,15 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(pri->relid, GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(pri->relation->rd_rel->relkind),
+ RelationGetRelationName(pri->relation));
- obj = publication_add_relation(pubid, rel, if_not_exists);
+ obj = publication_add_relation(pubid, pri, if_not_exists);
if (stmt)
{
EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
@@ -644,11 +678,10 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pri->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -658,7 +691,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pri->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index eb24195438..d82ea003db 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -426,14 +426,14 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
vacuum_relation_list opt_vacuum_relation_list
- drop_option_list
+ drop_option_list publication_table_list
%type <node> opt_routine_body
%type <groupclause> group_clause
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -9612,7 +9612,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9643,7 +9643,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9651,7 +9651,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9669,6 +9669,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 24268eb502..8fb953b54f 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -950,6 +957,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f928c32311..fc4170e723 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -509,6 +516,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1769,6 +1777,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3089,6 +3100,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de7da..e946f17c64 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 682c107e74..980826a502 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -691,19 +691,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -799,6 +803,55 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /* Get relation qual */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -812,6 +865,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -820,7 +874,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -829,16 +883,23 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* List of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -847,8 +908,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index abd5217ab1..10f85365fc 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,26 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -61,6 +69,11 @@ static void pgoutput_stream_abort(struct LogicalDecodingContext *ctx,
static void pgoutput_stream_commit(struct LogicalDecodingContext *ctx,
ReorderBufferTXN *txn,
XLogRecPtr commit_lsn);
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_prepare_expr(Node *rfnode, EState *estate);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, List *rowfilter);
static bool publications_valid;
static bool in_streaming;
@@ -99,6 +112,7 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *qual;
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -122,7 +136,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation rel);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -520,6 +534,148 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
OutputPluginWrite(ctx, false);
}
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+static ExprState *
+pgoutput_row_filter_prepare_expr(Node *rfnode, EState *estate)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ exprstate = ExecPrepareExpr(expr, estate);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated; otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, List *rowfilter)
+{
+ TupleDesc tupdesc;
+ EState *estate;
+ ExprContext *ecxt;
+ MemoryContext oldcxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (rowfilter == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ tupdesc = RelationGetDescr(relation);
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldcxt);
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has
+ * different row filters in those publications, all of the row filters
+ * must match for the change to be replicated.
+ */
+ foreach(lc, rowfilter)
+ {
+ Node *rfnode = (Node *) lfirst(lc);
+ ExprState *exprstate;
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_prepare_expr(rfnode, estate);
+
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ elog(DEBUG3, "row filter \"%s\" %smatched",
+ TextDatumGetCString(DirectFunctionCall2(pg_get_expr,
+ CStringGetTextDatum(nodeToString(rfnode)),
+ ObjectIdGetDatum(relation->rd_id))),
+ result ? "" : "not ");
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
/*
* Sends the decoded DML over wire.
*
@@ -547,7 +703,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -571,8 +727,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, txn, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -580,6 +734,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry->qual))
+ return;
+
+ /*
+ * The schema must be sent before the logic below that may switch the
+ * relation, because it also sends the ancestor's schema.
+ */
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -603,6 +767,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry->qual))
+ return;
+
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -631,6 +801,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry->qual))
+ return;
+
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -689,12 +865,11 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
for (i = 0; i < nrelations; i++)
{
Relation relation = relations[i];
- Oid relid = RelationGetRelid(relation);
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -704,10 +879,10 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
* root tables through it.
*/
if (relation->rd_rel->relispartition &&
- relentry->publish_as_relid != relid)
+ relentry->publish_as_relid != relentry->relid)
continue;
- relids[nrelids++] = relid;
+ relids[nrelids++] = relentry->relid;
maybe_send_schema(ctx, txn, change, relation, relentry);
}
@@ -1005,16 +1180,21 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation rel)
{
RelationSyncEntry *entry;
- bool am_partition = get_rel_relispartition(relid);
- char relkind = get_rel_relkind(relid);
+ Oid relid;
+ bool am_partition;
+ char relkind;
bool found;
MemoryContext oldctx;
Assert(RelationSyncCache != NULL);
+ relid = RelationGetRelid(rel);
+ am_partition = get_rel_relispartition(relid);
+ relkind = get_rel_relkind(relid);
+
/* Find cached relation info, creating if not found */
entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
(void *) &relid,
@@ -1030,6 +1210,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->qual = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1063,6 +1244,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1122,9 +1306,29 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ entry->qual = lappend(entry->qual, rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1242,6 +1446,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1251,6 +1456,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1268,5 +1475,11 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->qual != NIL)
+ list_free_deep(entry->qual);
+ entry->qual = NIL;
}
+
+ MemoryContextSwitchTo(oldctx);
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 321152151d..6f944ec60d 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4172,6 +4172,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4182,9 +4183,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4193,6 +4201,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4233,6 +4242,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4265,8 +4278,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index ba9bc6ddd2..7d72d498c1 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -626,6 +626,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 2abf255798..e2e64cb3bf 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBuffer(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index f332bad4d4..333c2b581d 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -83,6 +83,13 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationInfo
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationInfo;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -108,7 +115,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationInfo *targetrel,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504cbb..154bb61777 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index d9e417bcd7..2037705f45 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -491,6 +491,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index def9651b34..cf815cc0f2 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3624,12 +3624,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3642,7 +3649,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 1500de2dd0..4537543a7b 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 63d6ab7a4e..96d869dd27 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -156,6 +156,40 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: "testpub_view" is not a table
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075368..35211c56f6 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,29 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+\dRp+ testpub5
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/020_row_filter.pl b/src/test/subscription/t/020_row_filter.pl
new file mode 100644
index 0000000000..35a41741d3
--- /dev/null
+++ b/src/test/subscription/t/020_row_filter.pl
@@ -0,0 +1,221 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+
+# test row filtering
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x"
+);
+# use partition row filter:
+# - replicate (1, 100) because 1 < 6000 is true
+# - don't replicate (8000, 101) because 8000 < 6000 is false
+# - replicate (15000, 102) because partition tab_rowfilter_greater_10k doesn't have a row filter
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(8000, 101),(15000, 102)"
+);
+# insert directly into partition
+# use partition row filter: replicate (2, 200) because 2 < 6000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200)");
+# use partition row filter: replicate (5500, 300) because 5500 < 6000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(5500, 300)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table sync to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check filtered data was copied to subscriber');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102), 'check filtered data was copied to subscriber');
+
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+# UPDATE is not replicated; the row filter evaluates to false when b = NULL
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+# DELETE is not replicated; b is not part of the PK or replica identity, and
+# the old tuple contains b = NULL, hence the row filter evaluates to false
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check filtered data was copied to subscriber');
+
+# publish using partitioned table
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+# use partitioned table row filter: replicate, 4000 < 5000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400)");
+# use partitioned table row filter: replicate, 4500 < 5000 is true
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+# use partitioned table row filter: don't replicate, 5600 < 5000 is false
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+# use partitioned table row filter: don't replicate, 16000 < 5000 is false
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 1950)");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
2.27.0
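For readers following the patch, the evaluation rule implemented by pgoutput_row_filter (a NULL result counts as non-matching, and every publication's row filter for the table must match before the change is replicated) can be sketched as follows. This is only an illustrative model with hypothetical helper names, not code from the patch:

```python
def row_matches(filters, evaluate):
    """Return True if the row passes every publication's row filter.

    `filters` is the list of filter expressions cached for the table
    (one per publication that defines a WHERE clause); `evaluate` maps
    an expression to True, False, or None (modeling SQL NULL).

    Mirrors the patch's semantics: a NULL result is treated as "does
    not match", and all filters must match for the change to be sent.
    """
    for f in filters:
        result = evaluate(f)
        if result is None or not result:
            # One filter failed (or returned NULL): skip the change.
            return False
    return True

# No row filter defined for the table: the change is always replicated.
assert row_matches([], lambda f: None)
```

As in the C code, the loop bails out on the first non-matching filter, so later filters are not evaluated once the change is known to be skipped.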
v17-0002-PS-tmp-describe-intermediate-test-steps.patchapplication/octet-stream; name=v17-0002-PS-tmp-describe-intermediate-test-steps.patchDownload
From d6c682b7e25a60502dfd88433487fa1bbedf195c Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 2 Jul 2021 12:11:01 +1000
Subject: [PATCH v17 2/5] PS tmp - describe intermediate test steps
Added more \dRp+ calls to also show the intermediate states of the row filters.
---
src/test/regress/expected/publication.out | 44 ++++++++++++++++++-----
src/test/regress/sql/publication.sql | 5 ++-
2 files changed, 40 insertions(+), 9 deletions(-)
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 96d869dd27..30b7576138 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -163,10 +163,46 @@ CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
-- fail - functions disallowed
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
ERROR: functions are not allowed in publication WHERE expressions
@@ -177,14 +213,6 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: syntax error at or near "WHERE"
LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
^
-\dRp+ testpub5
- Publication testpub5
- Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
---------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
-Tables:
- "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
-
DROP TABLE testpub_rf_tbl1;
DROP TABLE testpub_rf_tbl2;
DROP TABLE testpub_rf_tbl3;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 35211c56f6..5a32e1650d 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -100,15 +100,18 @@ CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
RESET client_min_messages;
+\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
-- fail - functions disallowed
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-- fail - WHERE not allowed in DROP
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
-\dRp+ testpub5
DROP TABLE testpub_rf_tbl1;
DROP TABLE testpub_rf_tbl2;
--
2.27.0
v17-0003-PS-tmp-add-more-comments-for-expected-results.patchapplication/octet-stream; name=v17-0003-PS-tmp-add-more-comments-for-expected-results.patchDownload
From 59e72367c823ea3ace472da0103147ba780b1cef Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 2 Jul 2021 12:38:47 +1000
Subject: [PATCH v17 3/5] PS tmp - add more comments for expected results
No test code is changed, but this patch adds many more comments about the reasons for the expected results.
---
src/test/subscription/t/020_row_filter.pl | 89 ++++++++++++++++++-----
1 file changed, 70 insertions(+), 19 deletions(-)
diff --git a/src/test/subscription/t/020_row_filter.pl b/src/test/subscription/t/020_row_filter.pl
index 35a41741d3..e018b0d08c 100644
--- a/src/test/subscription/t/020_row_filter.pl
+++ b/src/test/subscription/t/020_row_filter.pl
@@ -83,7 +83,12 @@ $node_publisher->safe_psql('postgres',
"ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
);
-# test row filtering
+# ----------------------------------------------------------
+# The following inserts come before the CREATE SUBSCRIPTION,
+# so these are for testing the initial table copy_data
+# replication.
+# ----------------------------------------------------------
+
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
$node_publisher->safe_psql('postgres',
@@ -96,20 +101,13 @@ $node_publisher->safe_psql('postgres',
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
$node_publisher->safe_psql('postgres',
- "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x"
-);
-# use partition row filter:
-# - replicate (1, 100) because 1 < 6000 is true
-# - don't replicate (8000, 101) because 8000 < 6000 is false
-# - replicate (15000, 102) because partition tab_rowfilter_greater_10k doesn't have row filter
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert into partitioned table and partitions
$node_publisher->safe_psql('postgres',
- "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(8000, 101),(15000, 102)"
-);
-# insert directly into partition
-# use partition row filter: replicate (2, 200) because 2 < 6000 is true
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(8000, 101),(15000, 102)");
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200)");
-# use partition row filter: replicate (5500, 300) because 5500 < 6000 is true
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(5500, 300)");
@@ -127,6 +125,12 @@ my $synced_query =
$node_subscriber->poll_query_until('postgres', $synced_query)
or die "Timed out while waiting for subscriber to synchronize data";
+# Check expected replicated rows for tap_row_filter_1
+# pub1 filter is: (a > 1000 AND b <> 'filtered')
+# - (1, 'not replicated') - no, because a not > 1000
+# - (1500, 'filtered') - no, because b == 'filtered'
+# - (1980, 'not filtered') - YES
+# - SELECT x, 'test ' || x FROM generate_series(990,1002) x" - YES, only for 1001,1002 because a > 1000
my $result =
$node_subscriber->safe_psql('postgres',
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
@@ -134,16 +138,38 @@ is( $result, qq(1001|test 1001
1002|test 1002
1980|not filtered), 'check filtered data was copied to subscriber');
+# Check expected replicated rows for tab_row_filter_2
+# pub1 filter is: (c % 2 = 0)
+# pub2 filter is: (c % 3 = 0)
+# So only 2, 4, 6, 8, 10, 12, 14, 16, 18, 20 should pass filter on pub1
+# So only 3, 6, 9, 12, 15, 18 should pass filter on pub2
+# So combined is 6, 12, 18, which is count 3, min 6, max 18
$result =
$node_subscriber->safe_psql('postgres',
"SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
is($result, qq(3|6|18), 'check filtered data was copied to subscriber');
+# Check expected replicated rows for tab_row_filter_3
+# filter is null.
+# 10 rows are inserted, so 10 rows are replicated.
$result =
$node_subscriber->safe_psql('postgres',
"SELECT count(a) FROM tab_rowfilter_3");
is($result, qq(10), 'check filtered data was copied to subscriber');
+# Check expected replicated rows for partitions
+# PUBLICATION option "publish_via_partition_root" is default, so use the filter at table level
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: null
+# INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(8000, 101),(15000, 102)
+# - (1,100) YES, because 1 < 6000
+# - (8000, 101) NO, because fails 8000 < 6000
+# - (15000, 102) YES, because tab_rowfilter_greater_10k has null filter
+# INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200)
+# - (2, 200) YES, because 2 < 6000
+# INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(5500, 300)
+# - (5500, 300) YES, because 5500 < 6000 (Note: using the filter at the table, not the partition root)
$result =
$node_subscriber->safe_psql('postgres',
"SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
@@ -156,6 +182,11 @@ $result =
"SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
is($result, qq(15000|102), 'check filtered data was copied to subscriber');
+# ------------------------------------------------------------
+# The following operations come after the CREATE SUBSCRIPTION,
+# so these are for testing normal replication behaviour.
+# -----------------------------------------------------------
+
# test row filter (INSERT, UPDATE, DELETE)
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
@@ -165,18 +196,26 @@ $node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
-# UPDATE is not replicated ; row filter evaluates to false when b = NULL
$node_publisher->safe_psql('postgres',
"UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
$node_publisher->safe_psql('postgres',
"UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
-# DELETE is not replicated ; b is not part of the PK or replica identity and
-# old tuple contains b = NULL, hence, row filter evaluates to false
$node_publisher->safe_psql('postgres',
"DELETE FROM tab_rowfilter_1 WHERE a = 1700");
$node_publisher->wait_for_catchup($appname);
+# Check expected replicated rows for tap_row_filter_1
+# pub1 filter is: (a > 1000 AND b <> 'filtered')
+# - 1001, 1002, 1980 already exist from previous inserts
+# - (800, 'test 800') NO because 800 < 1000
+# - (1600, 'test 1600') YES
+# - (1601, 'test 1601') YES
+# - (1700, 'test 1700') YES
+# UPDATE (1600, NULL) NO. row filter evaluates to false when b = NULL
+# UPDATE (1601, 'test 1601 updated') YES
+# DELETE (1700), NO. b is not part of the PK or replica identity and
+# old tuple contains b = NULL, hence, row filter evaluates to false
$result =
$node_subscriber->safe_psql('postgres',
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
@@ -194,21 +233,33 @@ $node_subscriber->safe_psql('postgres',
"TRUNCATE TABLE tab_rowfilter_partitioned");
$node_subscriber->safe_psql('postgres',
"ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
-# use partitioned table row filter: replicate, 4000 < 5000 is true
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400)");
-# use partitioned table row filter: replicate, 4500 < 5000 is true
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
-# use partitioned table row filter: don't replicate, 5600 < 5000 is false
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
-# use partitioned table row filter: don't replicate, 16000 < 5000 is false
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 1950)");
$node_publisher->wait_for_catchup($appname);
+# Check expected replicated rows for partitions
+# PUBLICATION option "publish_via_partition_root = true" is default, so use the filter at root level
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: null
+# Existing INSERTS (copied because of copy_data=true option)
+# - (1,100) YES, 1 < 5000
+# - (8000, 101) NO, fails 8000 < 5000
+# - (15000, 102) NO, fails 15000 < 5000
+# - (2, 200) YES, 2 < 6000
+# - (5500, 300) NO, fails 5500 < 5000
+# New INSERTS replicated (after the initial copy_data)?
+# - VALUES(4000, 400) YES, 4000 < 5000
+# - VALUES(4500, 450) YES, 4500 < 5000
+# - VALUES(5600, 123) NO fails 5600 < 5000
+# - VALUES(16000, 1950) NO fails 16000 < 5000
$result =
$node_subscriber->safe_psql('postgres',
"SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
--
2.27.0
v17-0004-PS-POC-Implement-a-plan-cache-for-pgoutput.patchapplication/octet-stream; name=v17-0004-PS-POC-Implement-a-plan-cache-for-pgoutput.patchDownload
From 002ce81dbae3df85610f65a235b45d046af0830f Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 2 Jul 2021 16:54:45 +1000
Subject: [PATCH v17 4/5] PS POC - Implement a plan cache for pgoutput.
This is a POC patch to implement a plan cache which gets used inside the pgoutput_row_filter function instead of calling prepare for every row.
This is intended to implement a cache like what Andres was suggesting [1] to see what difference it makes.
Use #if 0/1 to toggle without/with caching.
[1] https://www.postgresql.org/message-id/20210128022032.eq2qqc6zxkqn5syt%40alap3.anarazel.de
---
src/backend/replication/pgoutput/pgoutput.c | 90 +++++++++++++++++++--
1 file changed, 82 insertions(+), 8 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 10f85365fc..86aa012505 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -35,6 +35,7 @@
#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
+#include "optimizer/optimizer.h"
PG_MODULE_MAGIC;
@@ -72,8 +73,6 @@ static void pgoutput_stream_commit(struct LogicalDecodingContext *ctx,
static EState *create_estate_for_relation(Relation rel);
static ExprState *pgoutput_row_filter_prepare_expr(Node *rfnode, EState *estate);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
- HeapTuple newtuple, List *rowfilter);
static bool publications_valid;
static bool in_streaming;
@@ -113,6 +112,7 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
List *qual;
+ List *exprstate_list;
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -144,6 +144,8 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
/*
* Specify output plugin callbacks
@@ -578,6 +580,35 @@ pgoutput_row_filter_prepare_expr(Node *rfnode, EState *estate)
return exprstate;
}
+static ExprState *
+pgoutput_row_filter_prepare_expr2(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /* Make the exprstate long-lived by using CacheMemoryContext. */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
/*
* Evaluates row filter.
*
@@ -610,7 +641,7 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
* If it returns true, the change is replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, List *rowfilter)
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
{
TupleDesc tupdesc;
EState *estate;
@@ -618,11 +649,20 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
MemoryContext oldcxt;
ListCell *lc;
bool result = true;
+//#define RF_TIMES
+#ifdef RF_TIMES
+ instr_time start_time;
+ instr_time end_time;
+#endif
/* Bail out if there is no row filter */
- if (rowfilter == NIL)
+ if (entry->qual == NIL)
return true;
+#ifdef RF_TIMES
+ INSTR_TIME_SET_CURRENT(start_time);
+#endif
+
elog(DEBUG3, "table \"%s.%s\" has row filter",
get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
get_rel_name(relation->rd_id));
@@ -646,7 +686,9 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
* different row filter in these publications, all row filters must be
* matched in order to replicate this change.
*/
- foreach(lc, rowfilter)
+#if 0
+ /* Don't use cached plan. */
+ foreach(lc, entry->qual)
{
Node *rfnode = (Node *) lfirst(lc);
ExprState *exprstate;
@@ -667,12 +709,34 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, L
if (!result)
break;
}
+#else
+ /* Use cached plan. */
+ foreach(lc, entry->exprstate_list)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ elog(DEBUG3, "row filter %smatched", result ? "" : " not");
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+#endif
/* Cleanup allocated resources */
ResetExprContext(ecxt);
FreeExecutorState(estate);
PopActiveSnapshot();
+#ifdef RF_TIMES
+ INSTR_TIME_SET_CURRENT(end_time);
+ INSTR_TIME_SUBTRACT(end_time, start_time);
+ elog(LOG, "row filter time: %0.3f us", INSTR_TIME_GET_DOUBLE(end_time) * 1e6);
+#endif
+
return result;
}
@@ -735,7 +799,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, NULL, tuple, relentry->qual))
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
return;
/*
@@ -768,7 +832,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry->qual))
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
return;
maybe_send_schema(ctx, txn, change, relation, relentry);
@@ -802,7 +866,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry->qual))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
return;
maybe_send_schema(ctx, txn, change, relation, relentry);
@@ -1211,6 +1275,7 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->qual = NIL;
+ entry->exprstate_list = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1320,10 +1385,16 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
if (!rfisnull)
{
Node *rfnode;
+ ExprState *exprstate;
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
rfnode = stringToNode(TextDatumGetCString(rfdatum));
entry->qual = lappend(entry->qual, rfnode);
+
+ /* Cache the planned row filter */
+ exprstate = pgoutput_row_filter_prepare_expr2(rfnode);
+ entry->exprstate_list = lappend(entry->exprstate_list, exprstate);
+
MemoryContextSwitchTo(oldctx);
}
@@ -1479,6 +1550,9 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (entry->qual != NIL)
list_free_deep(entry->qual);
entry->qual = NIL;
+
+ /* FIXME - something to be freed here? */
+ entry->exprstate_list = NIL;
}
MemoryContextSwitchTo(oldctx);
--
2.27.0
v17-0005-Improve-row-filtering-performance.patchapplication/octet-stream; name=v17-0005-Improve-row-filtering-performance.patchDownload
From c3b5b555dfaef57162e3752d693f874d2c241387 Mon Sep 17 00:00:00 2001
From: Greg Nancarrow <gregn4422@gmail.com>
Date: Thu, 8 Jul 2021 10:21:39 +1000
Subject: [PATCH v17 5/5] Substantially improve performance of
pgoutput_row_filter().
Repeated tuple table slot creation in pgoutput_row_filter() results in degraded
performance and large memory usage. This is greatly improved by caching the row
filtering tuple table slot in the relation sync cache.
---
src/backend/replication/pgoutput/pgoutput.c | 25 +++++++++++++++------
1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 86aa012505..1b0fc64392 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -113,6 +113,7 @@ typedef struct RelationSyncEntry
PublicationActions pubactions;
List *qual;
List *exprstate_list;
+ TupleTableSlot *scantuple;
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -643,10 +644,8 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
static bool
pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
{
- TupleDesc tupdesc;
EState *estate;
ExprContext *ecxt;
- MemoryContext oldcxt;
ListCell *lc;
bool result = true;
//#define RF_TIMES
@@ -667,17 +666,13 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
get_rel_name(relation->rd_id));
- tupdesc = RelationGetDescr(relation);
-
PushActiveSnapshot(GetTransactionSnapshot());
estate = create_estate_for_relation(relation);
/* Prepare context per tuple */
ecxt = GetPerTupleExprContext(estate);
- oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
- ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
- MemoryContextSwitchTo(oldcxt);
+ ecxt->ecxt_scantuple = entry->scantuple;
ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
@@ -1279,15 +1274,31 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
+
+ entry->scantuple = NULL;
}
/* Validate the entry */
if (!entry->replicate_valid)
{
+ TupleDesc tupdesc;
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
+ /* Release any existing tuple table slot */
+ if (entry->scantuple)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /* Create a tuple table slot for use in row filtering */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(RelationGetDescr(rel));
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/* Reload publications if needed before use. */
if (!publications_valid)
{
--
2.27.0
On Fri, Jul 2, 2021, at 4:29 AM, Peter Smith wrote:
Hi.
I have been looking at the latest patch set (v16). Below are my review
comments and some patches.
Peter, thanks for your detailed review. Comments are inline.
1. Patch 0001 comment - typo
you can optionally filter rows that does not satisfy a WHERE condition
typo: does/does
Fixed.
2. Patch 0001 comment - typo
The WHERE clause should probably contain only columns that are part of
the primary key or that are covered by REPLICA IDENTITY. Otherwise,
and DELETEs won't be replicated.

Typo: "Otherwise, and DELETEs" ??
Fixed.
3. Patch 0001 comment - typo and clarification
If your publication contains partitioned table, the parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false -- default) or the partitioned table row filter.

Typo: "contains partitioned table" -> "contains a partitioned table"
Fixed.
Also, perhaps the text "or the partitioned table row filter." should
say "or the root partitioned table row filter." to disambiguate the
case where there are more levels of partitions like A->B->C. e.g. What
filter does C use?
I agree it can be confusing. BTW, CREATE PUBLICATION does not mention that the
root partitioned table is used. We should improve that sentence too.
4. src/backend/catalog/pg_publication.c - misleading names
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationInfo *targetrel,
 bool if_not_exists)

Leaving this parameter name as "targetrel" seems a bit misleading now
in the function code. Maybe this should be called something like "pri"
which is consistent with other places where you have declared
PublicationRelationInfo.

Also, consider declaring some local variables so that the patch may
have less impact on existing code. e.g.
Oid relid = pri->relid
Relation *targetrel = relationinfo->relation
Done.
5. src/backend/commands/publicationcmds.c - simplify code
- rels = OpenTableList(stmt->tables);
+ if (stmt->tableAction == DEFELEM_DROP)
+ rels = OpenTableList(stmt->tables, true);
+ else
+ rels = OpenTableList(stmt->tables, false);

Consider writing that code more simply as just:
rels = OpenTableList(stmt->tables, stmt->tableAction == DEFELEM_DROP);
It is not a common pattern to use an expression as a function argument in
Postgres. I prefer to use a variable with a suggestive name.
6. src/backend/commands/publicationcmds.c - bug?
- CloseTableList(rels);
+ CloseTableList(rels, false);
 }

Is this a potential bug? When you called OpenTableList the 2nd param
was maybe true/false, so is it correct to be unconditionally false
here? I am not sure.
Good catch.
7. src/backend/commands/publicationcmds.c - OpenTableList function comment.
 * Open relations specified by a RangeVar list.
+ * AlterPublicationStmt->tables has a different list element, hence, is_drop
+ * indicates if it has a RangeVar (true) or PublicationTable (false).
 * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
 * add them to a publication.

I am not sure about this. Should that comment instead say "indicates
if it has a Relation (true) or PublicationTable (false)"?
Fixed.
8. src/backend/commands/publicationcmds.c - OpenTableList
For some reason it feels kind of clunky to me for this function to be
processing the list differently according to the 2nd param. e.g. the
name "is_drop" seems quite unrelated to the function code, and more to
do with where it was called from. Sorry, I don't have any better ideas
for improvement atm.
My suggestion is to rename it to "pub_drop_table".
9. src/backend/commands/publicationcmds.c - OpenTableList bug?
I felt maybe this is a possible bug here because there seems no code
explicitly assigning the whereClause = NULL if "is_drop" is true so
maybe it can have a garbage value which could cause problems later.
Maybe this is fixed by using palloc0.
Fixed.
10. src/backend/commands/publicationcmds.c - CloseTableList function comment
Probably the meaning of "is_drop" should be described in this function comment.
Done.
11. src/backend/replication/pgoutput/pgoutput.c - get_rel_sync_entry signature.
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation rel);

I see that this function signature is modified but I did not see how
this parameter refactoring is actually related to the RowFilter patch.
Perhaps I am mistaken, but IIUC this only changes the relid =
RelationGetRelid(rel); to be done inside this function instead of
being done outside by the callers.
It is not critical for this patch so I removed it.
12. src/backend/replication/pgoutput/pgoutput.c - missing function comments
The static functions create_estate_for_relation and
pgoutput_row_filter_prepare_expr probably should be commented.
Done.
13. src/backend/replication/pgoutput/pgoutput.c -
pgoutput_row_filter_prepare_expr function name

+static ExprState *pgoutput_row_filter_prepare_expr(Node *rfnode,
+ EState *estate);

This function has an unfortunate name with the word "prepare" in it. I
wonder if a different name can be found for this function to avoid any
confusion with pgoutput functions (coming soon) which are related to
the two-phase commit "prepare".
The word "prepare" is related to the executor context. The function name
contains "row_filter" that is sufficient to distinguish it from any other
function whose context is "prepare". I replaced "prepare" with "init".
14. src/bin/psql/describe.c
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));

Because the where-clause value already has enclosing parentheses,
using " WHERE (%s)" seems overkill here. e.g. you can see the effect
in your src/test/regress/expected/publication.out file. I think this
should be changed to " WHERE %s" to give better output.
Peter E suggested that the extra parentheses be added. See 0005 [1].
15. src/include/catalog/pg_publication.h - new typedef
+typedef struct PublicationRelationInfo
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationInfo;

The new PublicationRelationInfo should also be added to
src/tools/pgindent/typedefs.list
Patches usually don't update typedefs.list. Check src/tools/pgindent/README.
16. src/include/nodes/parsenodes.h - new typedef
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;

The new PublicationTable should also be added to src/tools/pgindent/typedefs.list
Idem.
17. sql/publication.sql - show more output
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+\dRp+ testpub5

I felt that it would be better to have a "\dRp+ testpub5" after each
of the valid ALTER PUBLICATION steps to show the intermediate results
also; not just the final one at the end.
Done.
18. src/test/subscription/t/020_row_filter.pl - rename file
I think this file should be renamed to 021_row_filter.pl as there is
already an 020 TAP test present.
Done.
19. src/test/subscription/t/020_row_filter.pl - test comments
AFAIK the test cases are all OK, but it was really quite hard to
review these TAP tests to try to determine what the expected results
should be.
I included your comments but heavily changed it.
20. src/test/subscription/t/020_row_filter.pl - missing test case?
There are some partition tests, but I did not see any test that was
like 3 levels deep like A->B->C, so I was not sure if there is any
case C would ever make use of the filter of its parent B, or would it
only use the filter of the root A?
I didn't include it yet. There is an issue with initial synchronization and
partitioned table when you set publish_via_partition_root. I'll start another
thread for this issue.
21. src/test/subscription/t/020_row_filter.pl - missing test case?
If the same table is in multiple publications they can each have a row
filter. And a subscription might subscribe to some but not all of
those publications. I think this scenario is only partly tested.
8<
e.g.
pub_1 has tableX with RowFilter1
pub_2 has tableX with RowFilter2

Then sub_12 subscribes to pub_1, pub_2
This is already tested in your TAP test (I think) and it makes sure
both filters are applied

But if there was also
pub_3 has tableX with RowFilter3

Then sub_12 should still only be checking RowFilter1 AND
RowFilter2 (but NOT RowFilter3). I think this scenario is not
tested.
tested.
I added a new publication tap_pub_not_used to cover this case.
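For readers following along, the intended semantics of this scenario can be sketched with a small Python analogy (the lambdas below are hypothetical stand-ins for the WHERE expressions; in this patch version, all filters of the subscribed publications are ANDed, and filters of unsubscribed publications are ignored):

```python
def row_matches(row, subscribed_filters):
    # In this patch version, all row filters from the publications the
    # subscription actually subscribes to must match for the change to
    # be replicated; filters of other publications play no part.
    return all(f(row) for f in subscribed_filters)

# Hypothetical filters mirroring the tab_rowfilter_2 test:
pub_1 = lambda r: r["c"] % 2 == 0   # pub_1 filter: (c % 2 = 0)
pub_2 = lambda r: r["c"] % 3 == 0   # pub_2 filter: (c % 3 = 0)
pub_3 = lambda r: r["c"] > 100      # pub_3: not subscribed, so ignored

rows = [{"c": c} for c in range(1, 21)]
replicated = [r["c"] for r in rows if row_matches(r, [pub_1, pub_2])]
print(replicated)  # [6, 12, 18]
```

This reproduces the expected result in the TAP test: count 3, min 6, max 18.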
POC PATCH FOR PLAN CACHE
========================

PSA a POC patch for a plan cache which gets used inside the
pgoutput_row_filter function instead of calling prepare for every row.
I think this is implementing something like Andres was suggesting a
while back [1].
I also had a WIP patch for it (that's very similar to your patch) so I merged
it.
This cache mechanism consists of caching the ExprState, avoiding a call to
pgoutput_row_filter_init_expr() for every single row. Greg N. suggested in
another email that the tuple table slot should also be cached to save a few
more cycles. That is also included in this new patch.
Measurements with/without this plan cache:
Time spent processing within the pgoutput_row_filter function
- Data was captured using the same technique as the
0002-Measure-row-filter-overhead.patch.
- Inserted 1000 rows, sampled data for the first 100 times in this function.
not cached: average ~ 28.48 us
cached: average ~ 9.75 us

Replication times:
- Using tables and row filters same as in Onder's commands_to_test_perf.sql [2]
100K rows - not cached: ~ 42sec, 43sec, 44sec
100K rows - cached: ~ 41sec, 42sec, 42sec

There does seem to be a tiny gain achieved by having the plan cache,
but I think the gain might be a lot less than what people were
expecting.
I did another measurement, using the previous patch (v16) as a baseline.
without cache (v16)
---------------------------
mean: 1.46 us
stddev: 2.13 us
median: 1.39 us
min-max: [0.69 .. 1456.69] us
percentile(99): 3.15 us
mode: 0.91 us
with cache (v18)
-----------------------
mean: 0.63 us
stddev: 1.07 us
median: 0.55 us
min-max: [0.29 .. 844.87] us
percentile(99): 1.38 us
mode: 0.41 us
It represents a 57% reduction. It is a really good optimization for just a few
extra lines of code.
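For the record, the quoted percentages follow directly from the sampled means; checking the arithmetic:

```sql
SELECT round(100 * (28.48 - 9.75) / 28.48) AS pct_saved_first_test,  -- 66
       round(100 * (1.46 - 0.63) / 1.46)   AS pct_saved_v16_to_v18;  -- 57
```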
[1]: /messages/by-id/57373e8b-1264-cd37-404e-8edbcf7884cc@enterprisedb.com
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
v18-0001-Row-filter-for-logical-replication.patchtext/x-patch; name=v18-0001-Row-filter-for-logical-replication.patchDownload
From 6176df1880ba690a7e5d550b7ee8c533dd33712e Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 12:07:51 -0300
Subject: [PATCH v18 1/2] Row filter for logical replication
This feature adds row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy a WHERE clause may be
optionally filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table, which allows
different row filters to be defined for different tables. A new row
filter can be added simply by specifying a WHERE clause after the table
name. The WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If the subscriber is a pre-15 version, data
synchronization won't use row filters if they are defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 32 ++-
doc/src/sgml/ref/create_subscription.sgml | 11 +-
src/backend/catalog/pg_publication.c | 42 ++-
src/backend/commands/publicationcmds.c | 112 +++++---
src/backend/parser/gram.y | 24 +-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 13 +
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 94 +++++-
src/backend/replication/pgoutput/pgoutput.c | 268 +++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 71 +++++
src/test/regress/sql/publication.sql | 32 +++
src/test/subscription/t/021_row_filter.pl | 298 ++++++++++++++++++++
24 files changed, 1024 insertions(+), 80 deletions(-)
create mode 100644 src/test/subscription/t/021_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index f517a7d4af..dbf2f46c00 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6233,6 +6233,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b2c6..ca091aae33 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is executed with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbca55..5c2b7d0bd2 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -131,9 +135,9 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
on its partitions) contained in the publication will be published
using the identity and schema of the partitioned table rather than
that of the individual partitions that are actually changed; the
- latter is the default. Enabling this allows the changes to be
- replicated into a non-partitioned table or a partitioned table
- consisting of a different set of partitions.
+ latter is the default (<literal>false</literal>). Enabling this
+ allows the changes to be replicated into a non-partitioned table or a
+ partitioned table consisting of a different set of partitions.
</para>
<para>
@@ -182,6 +186,14 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
disallowed on those tables.
</para>
+ <para>
+ The <literal>WHERE</literal> clause should probably contain only columns
+ that are part of the primary key or be covered by <literal>REPLICA
+ IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
+ be replicated. For <command>INSERT</command> and <command>UPDATE</command>
+ operations, any column can be used in the <literal>WHERE</literal> clause.
+ </para>
+
<para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
@@ -197,6 +209,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -209,6 +226,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</programlisting>
</para>
+ <para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
<para>
Create a publication that publishes all changes in all tables:
<programlisting>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index e812beee37..7183700ed9 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,16 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If any table in the publications has a
+ <literal>WHERE</literal> clause, data synchronization does not use it
+ if the subscriber is a <productname>PostgreSQL</productname> version
+ before 15.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 86e415af89..a15f00f637 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,21 +144,27 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -177,6 +186,23 @@ publication_add_relation(Oid pubid, Relation targetrel,
check_publication_add_relation(targetrel);
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+
/* Form a tuple. */
memset(values, 0, sizeof(values));
memset(nulls, false, sizeof(nulls));
@@ -189,6 +215,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -205,6 +237,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 95c253c8e0..12adcf1f36 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -48,7 +48,7 @@
/* Same as MAXNUMMESSAGES in sinvaladt.c */
#define MAX_RELCACHE_INVAL_MSGS 4096
-static List *OpenTableList(List *tables);
+static List *OpenTableList(List *tables, bool pub_drop_table);
static void CloseTableList(List *rels);
static void PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
AlterPublicationStmt *stmt);
@@ -232,7 +232,7 @@ CreatePublication(CreatePublicationStmt *stmt)
Assert(list_length(stmt->tables) > 0);
- rels = OpenTableList(stmt->tables);
+ rels = OpenTableList(stmt->tables, false);
PublicationAddTables(puboid, rels, true, NULL);
CloseTableList(rels);
}
@@ -361,6 +361,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *rels = NIL;
Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
Oid pubid = pubform->oid;
+ bool isdrop = (stmt->tableAction == DEFELEM_DROP);
/* Check that user is allowed to manipulate the publication tables. */
if (pubform->puballtables)
@@ -372,7 +373,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
- rels = OpenTableList(stmt->tables);
+ rels = OpenTableList(stmt->tables, isdrop);
if (stmt->tableAction == DEFELEM_ADD)
PublicationAddTables(pubid, rels, false, stmt);
@@ -385,31 +386,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
+ PublicationRelationInfo *oldrel;
- foreach(newlc, rels)
- {
- Relation newrel = (Relation) lfirst(newlc);
-
- if (RelationGetRelid(newrel) == oldrelid)
- {
- found = true;
- break;
- }
- }
-
- if (!found)
- {
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
-
- delrels = lappend(delrels, oldrel);
- }
+ oldrel = palloc(sizeof(PublicationRelationInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -500,26 +494,44 @@ RemovePublicationRelById(Oid proid)
/*
* Open relations specified by a RangeVar list.
+ *
+ * Publication node can have a different list element, hence, pub_drop_table
+ * indicates if it has a Relation (true) or PublicationTable (false).
+ *
* The returned tables are locked in ShareUpdateExclusiveLock mode in order to
* add them to a publication.
*/
static List *
-OpenTableList(List *tables)
+OpenTableList(List *tables, bool pub_drop_table)
{
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationInfo *pri;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
- bool recurse = rv->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
+ if (pub_drop_table)
+ {
+ rv = castNode(RangeVar, lfirst(lc));
+ }
+ else
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+
+ recurse = rv->inh;
+
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
@@ -538,8 +550,14 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = myrelid;
+ pri->relation = rel;
+ if (pub_drop_table)
+ pri->whereClause = NULL;
+ else
+ pri->whereClause = t->whereClause;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, myrelid);
/*
@@ -572,7 +590,15 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = childrelid;
+ pri->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (pub_drop_table)
+ pri->whereClause = NULL;
+ else
+ pri->whereClause = t->whereClause;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, childrelid);
}
}
@@ -585,6 +611,9 @@ OpenTableList(List *tables)
/*
* Close all relations in the list.
+ *
+ * Publication node can have a different list element, hence, pub_drop_table
+ * indicates if it has a Relation (true) or PublicationTable (false).
*/
static void
CloseTableList(List *rels)
@@ -593,10 +622,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(pri->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -612,15 +643,15 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(pri->relid, GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(pri->relation->rd_rel->relkind),
+ RelationGetRelationName(pri->relation));
- obj = publication_add_relation(pubid, rel, if_not_exists);
+ obj = publication_add_relation(pubid, pri, if_not_exists);
if (stmt)
{
EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
@@ -644,11 +675,10 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pri->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -658,7 +688,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pri->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index eb24195438..d82ea003db 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -426,14 +426,14 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
vacuum_relation_list opt_vacuum_relation_list
- drop_option_list
+ drop_option_list publication_table_list
%type <node> opt_routine_body
%type <groupclause> group_clause
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -9612,7 +9612,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9643,7 +9643,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9651,7 +9651,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9669,6 +9669,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 24268eb502..8fb953b54f 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -950,6 +957,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f928c32311..fc4170e723 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -509,6 +516,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1769,6 +1777,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3089,6 +3100,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de7da..e946f17c64 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23afc..29f8835ce1 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 682c107e74..980826a502 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -691,19 +691,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -799,6 +803,55 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /* Get relation qual */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -812,6 +865,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -820,7 +874,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -829,16 +883,23 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* List of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -847,8 +908,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
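For reviewers, the copy_table() hunks above can be summarized as: with no row filter, a plain COPY is issued; with one or more filters (possibly from multiple publications), a COPY (SELECT ...) is built with all filters AND'ed together. This is an illustrative sketch only, not part of the patch; the names are hypothetical and the real code appends to a StringInfo in C and quote-qualifies identifiers.

```python
def build_copy_query(nspname, relname, attnames, quals):
    """Sketch of how initial sync builds its COPY command.

    quals is a list of row filter expression strings fetched from the
    publisher (one per publication the table is in); empty means no filter.
    """
    target = f'{nspname}.{relname}'  # real code uses quote_qualified_identifier()
    if not quals:
        # Regular table without a row filter: plain COPY suffices
        return f'COPY {target} TO STDOUT'
    # With row filters, copy the result of a SELECT with AND'ed filters
    cols = ', '.join(attnames)
    where = ' AND '.join(quals)
    return f'COPY (SELECT {cols} FROM {target} WHERE {where}) TO STDOUT'
```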
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index abd5217ab1..08c018a300 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -99,6 +108,9 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *qual; /* row filter */
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -122,7 +134,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -131,6 +143,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -520,6 +539,154 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
OutputPluginWrite(ctx, false);
}
+/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache the ExprState in CacheMemoryContext. This mirrors
+ * ExecPrepareExpr(), which is not used here because it requires an
+ * EState. The executor should probably provide a separate function for
+ * preparing expressions that are executed outside a normal Plan tree
+ * context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change should be replicated, false otherwise.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->qual == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ if (entry->scantuple == NULL)
+ elog(DEBUG1, "entry->scantuple is null");
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the table has a
+ * different row filter in each of them, all row filters must match for
+ * the change to be replicated.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ elog(DEBUG3, "row filter %smatched", result ? "" : "not ");
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
/*
* Sends the decoded DML over wire.
*
@@ -547,7 +714,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -571,8 +738,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, txn, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -580,6 +745,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * Send the schema before possibly switching the relation, because
+ * maybe_send_schema() also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -603,6 +778,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -631,6 +812,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -694,7 +881,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1005,9 +1192,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1030,6 +1218,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->qual = NIL;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1041,6 +1232,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc;
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1054,6 +1246,23 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+
+ elog(DEBUG1, "get_rel_sync_entry: free entry->scantuple");
+ }
+
+ /* create a tuple table slot for row filter */
+ tupdesc = RelationGetDescr(relation);
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
+ elog(DEBUG1, "get_rel_sync_entry: allocate entry->scantuple");
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1063,6 +1272,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1122,9 +1334,34 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ entry->qual = lappend(entry->qual, rfnode);
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1242,6 +1479,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1251,6 +1489,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1268,5 +1508,15 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->qual != NIL)
+ list_free_deep(entry->qual);
+ entry->qual = NIL;
+
+ if (entry->exprstate != NIL)
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
}
+
+ MemoryContextSwitchTo(oldctx);
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 321152151d..6f944ec60d 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4172,6 +4172,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4182,9 +4183,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4193,6 +4201,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4233,6 +4242,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4265,8 +4278,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index ba9bc6ddd2..7d72d498c1 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -626,6 +626,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 2abf255798..e2e64cb3bf 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBuffer(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index f332bad4d4..2703b9c3fe 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -83,6 +83,13 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationInfo
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationInfo;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -108,7 +115,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504cbb..154bb61777 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index d9e417bcd7..2037705f45 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -491,6 +491,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index def9651b34..cf815cc0f2 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3624,12 +3624,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3642,7 +3649,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 1500de2dd0..4537543a7b 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 63d6ab7a4e..444f8344bc 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -156,6 +156,77 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: "testpub_view" is not a table
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075368..b1606cce7e 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,38 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/021_row_filter.pl b/src/test/subscription/t/021_row_filter.pl
new file mode 100644
index 0000000000..0f6d2f0128
--- /dev/null
+++ b/src/test/subscription/t/021_row_filter.pl
@@ -0,0 +1,298 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When a table is published via multiple publications, all filter
+# expressions must succeed. In this case, a row is replicated only if the
+# c value is divisible by both 2 and 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# The publication option publish_via_partition_root is false, so the row filter
+# from the leaf partition is used.
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# The publication option publish_via_partition_root is true, so the row filter
+# from the root partitioned table is used.
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
2.20.1
v18-0002-Measure-row-filter-overhead.patchtext/x-patch; name=v18-0002-Measure-row-filter-overhead.patchDownload
From b0d60791f06908d2dc118ff7f5dc669c91062913 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Sun, 31 Jan 2021 20:48:43 -0300
Subject: [PATCH v18 2/2] Measure row filter overhead
---
src/backend/replication/pgoutput/pgoutput.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 08c018a300..5700a3306b 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -638,6 +638,8 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
ExprContext *ecxt;
ListCell *lc;
bool result = true;
+ instr_time start_time;
+ instr_time end_time;
/* Bail out if there is no row filter */
if (entry->qual == NIL)
@@ -647,13 +649,12 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
get_rel_name(relation->rd_id));
+ INSTR_TIME_SET_CURRENT(start_time);
+
PushActiveSnapshot(GetTransactionSnapshot());
estate = create_estate_for_relation(relation);
- if (entry->scantuple == NULL)
- elog(DEBUG1, "entry->scantuple is null");
-
/* Prepare context per tuple */
ecxt = GetPerTupleExprContext(estate);
ecxt->ecxt_scantuple = entry->scantuple;
@@ -684,6 +685,11 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
FreeExecutorState(estate);
PopActiveSnapshot();
+ INSTR_TIME_SET_CURRENT(end_time);
+ INSTR_TIME_SUBTRACT(end_time, start_time);
+
+ elog(DEBUG2, "row filter time: %0.3f us", INSTR_TIME_GET_DOUBLE(end_time) * 1e6);
+
return result;
}
@@ -1251,8 +1257,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
{
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
-
- elog(DEBUG1, "get_rel_sync_entry: free entry->scantuple");
}
/* create a tuple table slot for row filter */
@@ -1261,8 +1265,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
MemoryContextSwitchTo(oldctx);
- elog(DEBUG1, "get_rel_sync_entry: allocate entry->scantuple");
-
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
--
2.20.1
On Mon, Jul 5, 2021, at 12:14 AM, Greg Nancarrow wrote:
I have some review comments on the "Row filter for logical replication" patch:
(1) Suggested update to patch comment:
(There are some missing words and things which could be better expressed)
I incorporated all your wording suggestions.
(2) Some inconsistent error message wording:
Currently:
err = _("cannot use subquery in publication WHERE expression");
Suggest changing it to:
err = _("subqueries are not allowed in publication WHERE expressions");
The same expression "cannot use subquery in ..." is used in the other switch
cases. If you think this message can be improved, I suggest that you submit a
separate patch to change all sentences.
Other examples from the patch:
err = _("aggregate functions are not allowed in publication WHERE expressions");
err = _("grouping operations are not allowed in publication WHERE expressions");
err = _("window functions are not allowed in publication WHERE expressions");
errmsg("functions are not allowed in publication WHERE expressions"),
err = _("set-returning functions are not allowed in publication WHERE
expressions");
This is a different function. I just followed the same wording from similar
sentences around it.
(3) The current code still allows arbitrary code execution, e.g. via a
user-defined operator:
I fixed it in v18.
Perhaps add the following after the existing shell error-check in make_op():
/* User-defined operators are not allowed in publication WHERE clauses */
if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("user-defined operators are not allowed in publication
WHERE expressions"),
parser_errposition(pstate, location)));
I'm still working on a way to accept built-in functions but while we don't have
it, let's forbid custom operators too.
Also, I believe it's also allowing user-defined CASTs (so could add a
similar check to above in transformTypeCast()).
Ideally, it would be preferable to validate/check publication WHERE
expressions in one central place, rather than scattered all over the
place, but that might be easier said than done.
You need to update the patch comment accordingly.
I forgot to mention it in the patch I sent a few minutes ago. I'm not sure we
need to mention every error condition (especially one that will rarely be used).
(4) src/backend/replication/pgoutput/pgoutput.c
pgoutput_change()
The 3 added calls to pgoutput_row_filter() return from pgoutput_change()
if false is returned, but instead they should break out of the switch;
otherwise cleanup code is skipped. This is surely a bug.
Fixed.
In summary, v18 contains
* Peter Smith's review
* Greg Nancarrow's review
* cache ExprState
* cache TupleTableSlot
* forbid custom operators
* various fixes
--
Euler Taveira
EDB https://www.enterprisedb.com/
Hi,
I took a look at this patch, which seems to be in CF since 2018. I have
only some basic comments and observations at this point:
1) alter_publication.sgml
I think "expression is executed" sounds a bit strange, perhaps
"evaluated" would be better?
2) create_publication.sgml
Why is the patch changing publish_via_partition_root docs? That seems
like a rather unrelated bit.
The <literal>WHERE</literal> clause should probably contain only
columns that are part of the primary key or be covered by
<literal>REPLICA ...
I'm not sure what exactly is this trying to say. What does "should
probably ..." mean in practice for the users? Does that mean something
bad will happen for other columns, or what? I'm sure this wording will
be quite confusing for users.
It may also be unclear whether the condition is evaluated on the old or
new row, so perhaps add an example illustrating that & more detailed
comment, or something. E.g. what will happen with
UPDATE departments SET active = false WHERE active;
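To make the old-row/new-row question concrete, here is a toy Python model (my illustration, not the patch's C code) of the two possible semantics for that UPDATE against a hypothetical filter `(active = true)`:

```python
# Toy model of which row version an UPDATE's publication filter sees.
# Scenario: UPDATE departments SET active = false WHERE active;
# with a hypothetical publication row filter: (active = true)

def row_filter(row):
    """Stand-in for the publication WHERE clause: (active = true)."""
    return row["active"] is True

old_row = {"id": 1, "active": True}
new_row = {"id": 1, "active": False}

# If the filter is evaluated on the NEW row, the UPDATE is filtered
# out, so the subscriber keeps a stale copy of the row.
replicate_if_new = row_filter(new_row)   # not replicated

# If the filter is evaluated on the OLD row, the UPDATE is replicated
# even though the new row no longer matches the filter.
replicate_if_old = row_filter(old_row)   # replicated

print(replicate_if_new, replicate_if_old)
```

Either choice leaves the subscriber with a surprising state for rows that move across the filter boundary, which is why the docs need to spell out which one applies.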
3) publication_add_relation
Does this need to build the parse state even for whereClause == NULL?
4) AlterPublicationTables
I wonder if this new reworked code might have issues with subscriptions
containing many tables, but I haven't tried.
5) OpenTableList
I really dislike that the list can have two different node types
(Relation and PublicationTable). In principle we don't actually need the
extra flag, we can simply check the node type directly by IsA() and act
based on that. However, I think it'd be better to just use a single node
type from all places.
I don't see why not to set whereClause every time, I don't think the
extra if saves anything, it's just a bit more complex.
5) CloseTableList
The comment about node types seems pointless, this function has no flag
and the element type does not matter.
6) parse_agg.c
... are not allowed in publication WHERE expressions
I think all similar cases use "WHERE conditions" instead.
7) transformExprRecurse
The check at the beginning seems rather awkward / misplaced - it's way
too specific for this location (there are no other p_expr_kind
references in this function). Wouldn't transformFuncCall (or maybe
ParseFuncOrColumn) be a more appropriate place?
Initially I was wondering why not to allow function calls in WHERE
conditions, but I see that was discussed in the past as problematic. But
that reminds me that I don't see any docs describing what expressions
are allowed in WHERE conditions - maybe we should explicitly list what
expressions are allowed?
8) pgoutput.c
I have not reviewed this in detail yet, but there seems to be something
wrong because `make check-world` fails in subscription/010_truncate.pl
after hitting an assert (backtrace attached) during "START_REPLICATION
SLOT" in get_rel_sync_entry in this code:
/* Release tuple table slot */
if (entry->scantuple != NULL)
{
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
So there seems to be something wrong with how the slot is created.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
On Sun, Jul 11, 2021, at 4:39 PM, Euler Taveira wrote:
with cache (v18)
-----------------------
mean: 0.63 us
stddev: 1.07 us
median: 0.55 us
min-max: [0.29 .. 844.87] us
percentile(99): 1.38 us
mode: 0.41 us
It represents -57%. It is a really good optimization for just a few extra lines
of code.
cfbot seems to be unhappy with v18 on some of the hosts. Cirrus/FreeBSD failed
in the test 010_truncate. It also failed in a Cirrus/Linux box. I failed to
reproduce in my local FreeBSD box. Since it passes appveyor and Cirrus/macos,
it could probably be a transient issue.
$ uname -a
FreeBSD freebsd12 12.2-RELEASE FreeBSD 12.2-RELEASE r366954 GENERIC amd64
$ PROVE_TESTS="t/010_truncate.pl" gmake check
gmake -C ../../../src/backend generated-headers
gmake[1]: Entering directory '/usr/home/euler/pglr-row-filter-v17/src/backend'
gmake -C catalog distprep generated-header-symlinks
gmake[2]: Entering directory '/usr/home/euler/pglr-row-filter-v17/src/backend/catalog'
gmake[2]: Nothing to be done for 'distprep'.
gmake[2]: Nothing to be done for 'generated-header-symlinks'.
gmake[2]: Leaving directory '/usr/home/euler/pglr-row-filter-v17/src/backend/catalog'
gmake -C utils distprep generated-header-symlinks
gmake[2]: Entering directory '/usr/home/euler/pglr-row-filter-v17/src/backend/utils'
gmake[2]: Nothing to be done for 'distprep'.
gmake[2]: Nothing to be done for 'generated-header-symlinks'.
gmake[2]: Leaving directory '/usr/home/euler/pglr-row-filter-v17/src/backend/utils'
gmake[1]: Leaving directory '/usr/home/euler/pglr-row-filter-v17/src/backend'
rm -rf '/home/euler/pglr-row-filter-v17'/tmp_install
/bin/sh ../../../config/install-sh -c -d '/home/euler/pglr-row-filter-v17'/tmp_install/log
gmake -C '../../..' DESTDIR='/home/euler/pglr-row-filter-v17'/tmp_install install >'/home/euler/pglr-row-filter-v17'/tmp_install/log/install.log 2>&1
gmake -j1 checkprep >>'/home/euler/pglr-row-filter-v17'/tmp_install/log/install.log 2>&1
rm -rf '/usr/home/euler/pglr-row-filter-v17/src/test/subscription'/tmp_check
/bin/sh ../../../config/install-sh -c -d '/usr/home/euler/pglr-row-filter-v17/src/test/subscription'/tmp_check
cd . && TESTDIR='/usr/home/euler/pglr-row-filter-v17/src/test/subscription' PATH="/home/euler/pglr-row-filter-v17/tmp_install/home/euler/pgrf18/bin:$PATH" LD_LIBRARY_PATH="/home/euler/pglr-row-filter-v17/tmp_install/home/euler/pgrf18/lib" LD_LIBRARY_PATH_RPATH=1 PGPORT='69999' PG_REGRESS='/usr/home/euler/pglr-row-filter-v17/src/test/subscription/../../../src/test/regress/pg_regress' /usr/local/bin/prove -I ../../../src/test/perl/ -I . t/010_truncate.pl
t/010_truncate.pl .. ok
All tests successful.
Files=1, Tests=14, 5 wallclock secs ( 0.02 usr 0.00 sys + 1.09 cusr 0.99 csys = 2.10 CPU)
Result: PASS
--
Euler Taveira
EDB https://www.enterprisedb.com/
Hi
Andres complained about the safety of doing general expression
evaluation in pgoutput; that was first in
/messages/by-id/20210128022032.eq2qqc6zxkqn5syt@alap3.anarazel.de
where he described a possible approach to handle it by restricting
expressions to have limited shape; and later in
/messages/by-id/20210331191710.kqbiwe73lur7jo2e@alap3.anarazel.de
I was just scanning the patch trying to see if some sort of protection
had been added for this, but I couldn't find anything. (Some functions
are under-commented, though). So, is it there already, and if so what
is it? And if it isn't, then I think it should definitely be put there
in some form.
Thanks
--
Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/
On Mon, Jul 12, 2021 at 9:31 AM Euler Taveira <euler@eulerto.com> wrote:
cfbot seems to be unhappy with v18 on some of the hosts. Cirrus/FreeBSD failed
in the test 010_truncate. It also failed in a Cirrus/Linux box. I failed to
reproduce in my local FreeBSD box. Since it passes appveyor and Cirrus/macos,
it could probably be a transient issue.
I don't think it's a transient issue.
I also get a test failure in subscription/010_truncate.pl when I run
"make check-world" with the v18 patches applied.
The problem can be avoided with the following change (to match what
was originally in my v17-0005 performance-improvement patch):
diff --git a/src/backend/replication/pgoutput/pgoutput.c
b/src/backend/replication/pgoutput/pgoutput.c
index 08c018a300..800bae400b 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -1256,8 +1256,8 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
}
/* create a tuple table slot for row filter */
- tupdesc = RelationGetDescr(relation);
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(RelationGetDescr(relation));
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
MemoryContextSwitchTo(oldctx);
This creates a TupleDesc copy in CacheMemoryContext that is not
refcounted, so it side-steps the problem.
At this stage I am not sure why the original v18 patch code doesn't
work correctly for the TupleDesc refcounting here.
The TupleDesc refcount is zero when it's time to dealloc the tuple
slot (thus causing that Assert to fire), yet when the slot was
created, the TupleDesc refcount was incremented, so it seems
something else has already decremented the refcount by the time it
comes to deallocate the slot. Perhaps there's an order-of-cleanup or
MemoryContext issue here or some buggy code somewhere, not sure yet.
Regards,
Greg Nancarrow
Fujitsu Australia
On Mon, Jul 12, 2021 at 7:19 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Hi
Andres complained about the safety of doing general expression
evaluation in pgoutput; that was first in
/messages/by-id/20210128022032.eq2qqc6zxkqn5syt@alap3.anarazel.de
where he described a possible approach to handle it by restricting
expressions to have limited shape; and later in
/messages/by-id/20210331191710.kqbiwe73lur7jo2e@alap3.anarazel.de
I was just scanning the patch trying to see if some sort of protection
had been added for this, but I couldn't find anything. (Some functions
are under-commented, though). So, is it there already, and if so what
is it?
I think the patch is trying to prohibit arbitrary expressions in the
WHERE clause via
transformWhereClause(..EXPR_KIND_PUBLICATION_WHERE..). You can notice
that at various places the expressions are prohibited via
EXPR_KIND_PUBLICATION_WHERE. I am not sure that the checks are correct
and sufficient but I think there is some attempt to do it. For
example, the below sort of ad-hoc check for func_call doesn't seem to
be a good idea.
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE &&
nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
Now, the other idea I had in mind was to traverse the WHERE clause
expression in publication_add_relation and identify if it contains
anything other than the ANDed list of 'foo.bar op constant'
expressions. OTOH, for index where clause expressions or policy check
expressions, we use a technique similar to what we have in the patch
to prohibit certain kinds of expressions.
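The "ANDed list of 'foo.bar op constant'" idea can be sketched with a small walker over a toy expression tree (the tuple-based node shapes below are hypothetical stand-ins, not PostgreSQL's parse nodes):

```python
# Sketch of a walker that accepts only AND chains of
# 'column op constant' comparisons, rejecting everything else
# (function calls, subqueries, OR, user-defined operators, ...).
# Node shapes are invented for illustration.

def is_simple_filter(node):
    """Return True only for ANDed (column, op, constant) comparisons."""
    kind = node[0]
    if kind == "and":
        return all(is_simple_filter(arg) for arg in node[1:])
    if kind == "op":
        _, op, left, right = node
        return (op in ("=", "<>", "<", "<=", ">", ">=")
                and left[0] == "column"
                and right[0] == "const")
    return False  # anything else is prohibited

# (a > 1000 AND b <> 'filtered') -- allowed
ok = is_simple_filter(
    ("and",
     ("op", ">", ("column", "a"), ("const", 1000)),
     ("op", "<>", ("column", "b"), ("const", "filtered"))))

# (f(a) = 1) -- rejected: function call on the left-hand side
bad = is_simple_filter(
    ("op", "=", ("func", "f", ("column", "a")), ("const", 1)))

print(ok, bad)
```

The real implementation would additionally have to check that each operator is a built-in (e.g. OID below FirstNormalObjectId), since an extension's operator can run arbitrary code.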
Do you have any preference on how this should be addressed?
--
With Regards,
Amit Kapila.
On 7/12/21 6:46 AM, Amit Kapila wrote:
On Mon, Jul 12, 2021 at 7:19 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Hi
Andres complained about the safety of doing general expression
evaluation in pgoutput; that was first in
/messages/by-id/20210128022032.eq2qqc6zxkqn5syt@alap3.anarazel.de
where he described a possible approach to handle it by restricting
expressions to have limited shape; and later in
/messages/by-id/20210331191710.kqbiwe73lur7jo2e@alap3.anarazel.de
I was just scanning the patch trying to see if some sort of protection
had been added for this, but I couldn't find anything. (Some functions
are under-commented, though). So, is it there already, and if so what
is it?
I think the patch is trying to prohibit arbitrary expressions in the
WHERE clause via
transformWhereClause(..EXPR_KIND_PUBLICATION_WHERE..). You can notice
that at various places the expressions are prohibited via
EXPR_KIND_PUBLICATION_WHERE. I am not sure that the checks are correct
and sufficient but I think there is some attempt to do it. For
example, the below sort of ad-hoc check for func_call doesn't seem to
be a good idea.
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE &&
nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
Yes, I mentioned this bit of code in my review, although I was mostly
wondering if this is the wrong place to make this check.
Now, the other idea I had in mind was to traverse the WHERE clause
expression in publication_add_relation and identify if it contains
anything other than the ANDed list of 'foo.bar op constant'
expressions. OTOH, for index where clause expressions or policy check
expressions, we use a technique similar to what we have in the patch
to prohibit certain kinds of expressions.Do you have any preference on how this should be addressed?
I don't think this is sufficient, because who knows where "op" comes
from? It might be from an extension, in which case the problem pointed
out by Petr Jelinek [1] would apply. OTOH I suppose we could allow
expressions like (Var op Var), i.e. "a < b" or something like that. And
then why not allow (a+b < c-10) and similar "more complex" expressions,
as long as all the operators are built-in?
In terms of implementation, I think there are two basic options - either
we can define a new "expression" type in gram.y, which would be a subset
of a_expr etc. Or we can do it as some sort of expression walker, kinda
like what the transform* functions do now.
regards
[1]: /messages/by-id/92e5587d-28b8-5849-2374-5ca3863256f1@2ndquadrant.com
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Jul 12, 2021 at 1:09 AM Euler Taveira <euler@eulerto.com> wrote:
I did another measure using as baseline the previous patch (v16).
without cache (v16)
---------------------------
mean: 1.46 us
stddev: 2.13 us
median: 1.39 us
min-max: [0.69 .. 1456.69] us
percentile(99): 3.15 us
mode: 0.91 us

with cache (v18)
-----------------------
mean: 0.63 us
stddev: 1.07 us
median: 0.55 us
min-max: [0.29 .. 844.87] us
percentile(99): 1.38 us
mode: 0.41 us
It represents -57%. It is a really good optimization for just a few extra lines
of code.
Good improvement but I think it is better to measure the performance
by using synchronous replication, setting the subscriber in
synchronous_standby_names, which will show the overall saving of
time. We can probably compare the timings when no rows are filtered,
when 10% of rows are filtered, when 30% are filtered, and so on.
I think the way caching has been done in the patch is a bit
inefficient. Basically, it always invalidates and rebuilds the
expressions even though some unrelated operation has happened on
publication. For example, say publication has initially table t1 with
rowfilter r1 for which we have cached the state. Now you altered
publication and added table t2, it will invalidate the entire state of
t1 as well. I think we can avoid that if we invalidate the rowfilter
related state only on relcache invalidation i.e in
rel_sync_cache_relation_cb and save it the very first time we prepare
the expression. In that case, we don't need to do it in advance when
preparing relsyncentry, this will have the additional advantage that
we won't spend cycles on preparing state unless it is required (for
truncate we won't require row_filtering, so it won't be prepared).
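The lazy build-on-first-use scheme described above can be sketched like this (names are illustrative, not pgoutput's actual structures):

```python
# Sketch of a lazy per-relation row-filter cache: the compiled filter
# is built on first use and dropped only when that relation is
# invalidated, so unrelated publication changes don't force a rebuild.

class RelSyncEntry:
    def __init__(self):
        self.filter_fn = None          # compiled filter, built lazily

cache = {}                             # relid -> RelSyncEntry
compile_count = 0

def compile_filter(relid):
    """Stand-in for preparing the filter ExprState."""
    global compile_count
    compile_count += 1
    return lambda row: row.get("a", 0) > 1000   # stand-in expression

def row_filter(relid, row):
    entry = cache.setdefault(relid, RelSyncEntry())
    if entry.filter_fn is None:        # first use: build and cache
        entry.filter_fn = compile_filter(relid)
    return entry.filter_fn(row)

def relcache_invalidate(relid):
    """Analogue of rel_sync_cache_relation_cb: drop only this relation."""
    if relid in cache:
        cache[relid].filter_fn = None

row_filter(1, {"a": 2000})
row_filter(1, {"a": 500})              # cache hit, no recompilation
relcache_invalidate(1)
row_filter(1, {"a": 2000})             # rebuilt after invalidation
print(compile_count)
```

Note that a TRUNCATE on relation 1 would never call row_filter, so under this scheme no filter state is ever prepared for it.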
Few other things, I have noticed:
1.
I am seeing a tupdesc leak by following the steps below:
ERROR: tupdesc reference 00000000008D7D18 is not owned by resource
owner TopTransaction
CONTEXT: slot "tap_sub", output plugin "pgoutput", in the change
callback, associated LSN 0/170BD50
Publisher
CREATE TABLE tab_rowfilter_1 (a int primary key, b text);
CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000
AND b <> 'filtered');
Subscriber
CREATE TABLE tab_rowfilter_1 (a int primary key, b text);
CREATE SUBSCRIPTION tap_sub
CONNECTION 'host=localhost port=5432 dbname=postgres'
PUBLICATION tap_pub_1;
Publisher
INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered');
Alter table tab_rowfilter_1 drop column b cascade;
INSERT INTO tab_rowfilter_1 (a) VALUES (1982);
2.
postgres=# Alter table tab_rowfilter_1 alter column b set data type varchar;
ERROR: unexpected object depending on column: publication of table
tab_rowfilter_1 in publication tap_pub_1
I think for this you need to change ATExecAlterColumnType to handle
the publication case.
--
With Regards,
Amit Kapila.
While looking at the other logrep patch [1] (column filtering) I noticed
Alvaro's comment regarding a new parsenode (PublicationTable) not having
read/out/equal/copy funcs. I'd bet the same thing applies here, so
perhaps see if the patch needs the same fix.
[1]: /messages/by-id/202107062342.eq6htmp2wgp2@alvherre.pgsql
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Jul 12, 2021, at 8:44 AM, Tomas Vondra wrote:
While looking at the other logrep patch [1] (column filtering) I noticed
Alvaro's comment regarding a new parsenode (PublicationTable) not having
read/out/equal/copy funcs. I'd bet the same thing applies here, so
perhaps see if the patch needs the same fix.
Good catch! I completely forgot about _copyPublicationTable() and
_equalPublicationTable().
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Mon, Jul 12, 2021 at 5:39 AM Euler Taveira <euler@eulerto.com> wrote:
On Fri, Jul 2, 2021, at 4:29 AM, Peter Smith wrote:
Hi.
I have been looking at the latest patch set (v16). Below are my review
comments and some patches.
Peter, thanks for your detailed review. Comments are inline.
Hi Euler,
Thanks for addressing my previous review comments.
I have reviewed the latest v18 patch. Below are some more review
comments and patches.
(The patches 0003,0004 are just examples of what is mentioned in my
comments; The patches 0001,0002 are there only to try to keep cfbot
green).
//////////
1. Commit comment - wording
"When a publication is defined or modified, rows that don't satisfy a
WHERE clause may be
optionally filtered out."
=>
I think this means to say: "Rows that don't satisfy an optional WHERE
clause will be filtered out."
------
2. Commit comment - wording
"The row filter is per table, which allows different row filters to be
defined for different tables."
=>
I think all that is the same as just saying: "The row filter is per table."
------
3. PG docs - independent improvement
You wrote (ref [1] point 3):
"I agree it can be confusing. BTW, CREATE PUBLICATION does not mention that the
root partitioned table is used. We should improve that sentence too."
I agree, but that PG docs improvement is independent of your RowFilter
patch; please make another thread for that idea.
------
4. doc/src/sgml/ref/create_publication.sgml - independent improvement
@@ -131,9 +135,9 @@ CREATE PUBLICATION <replaceable
class="parameter">name</replaceable>
on its partitions) contained in the publication will be published
using the identity and schema of the partitioned table rather than
that of the individual partitions that are actually changed; the
- latter is the default. Enabling this allows the changes to be
- replicated into a non-partitioned table or a partitioned table
- consisting of a different set of partitions.
+ latter is the default (<literal>false</literal>). Enabling this
+ allows the changes to be replicated into a non-partitioned table or a
+ partitioned table consisting of a different set of partitions.
</para>
I think that Tomas wrote (ref [2] point 2) that this change seems
unrelated to your RowFilter patch.
I agree; I liked the change, but IMO you need to propose this one in
another thread too.
------
5. doc/src/sgml/ref/create_subscription.sgml - wording
@@ -102,7 +102,16 @@ CREATE SUBSCRIPTION <replaceable
class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If any table in the publications has a
+ <literal>WHERE</literal> clause, data synchronization does not use it
+ if the subscriber is a <productname>PostgreSQL</productname> version
+ before 15.
I felt that the sentence: "If any table in the publications has a
<literal>WHERE</literal> clause, data synchronization does not use it
if the subscriber is a <productname>PostgreSQL</productname> version
before 15."
Could be expressed more simply like: "If the subscriber is a
<productname>PostgreSQL</productname> version before 15 then any row
filtering is ignored."
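For clarity, the combined-filter behavior being documented here (a row is copied only if it satisfies every publication's expression, and, per the patch's commit message, a filter evaluating to NULL counts as false) can be sketched outside the backend. The filters and rows below are made-up illustrations, not the patch's code:

```python
# Sketch of initial-copy filtering when one table is published with
# several WHERE clauses (one per subscribed publication). A row is
# copied only if every filter is satisfied; a filter evaluating to
# NULL (None here) counts as not satisfied.

def row_passes(row, filters):
    """Return True only if every filter evaluates to True for the row."""
    for f in filters:
        result = f(row)          # may be True, False, or None (SQL NULL)
        if result is not True:   # NULL or False -> row is not replicated
            return False
    return True

# Two hypothetical publications publish the same table with different
# WHERE clauses.
filters = [
    lambda r: r["key"] > 0 if r["key"] is not None else None,
    lambda r: r["key"] > 1000 if r["key"] is not None else None,
]

rows = [{"key": 500}, {"key": 2000}, {"key": None}]
copied = [r for r in rows if row_passes(r, filters)]
# Only the row with key 2000 satisfies both expressions.
```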
------
6. src/backend/commands/publicationcmds.c - wrong function comment
@@ -585,6 +611,9 @@ OpenTableList(List *tables)
/*
* Close all relations in the list.
+ *
+ * Publication node can have a different list element, hence, pub_drop_table
+ * indicates if it has a Relation (true) or PublicationTable (false).
*/
static void
CloseTableList(List *rels)
=>
The 2nd parameter does not exist in v18, so that comment about
pub_drop_table seems to be a cut/paste error from the OpenTableList.
------
src/backend/replication/logical/tablesync.c - bug ?
@@ -829,16 +883,23 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* List of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
=>
I did not understand the above call to make_copy_attnamelist. The
result seems unused before it is overwritten later in this same
function (??)
------
7. src/backend/replication/logical/tablesync.c -
fetch_remote_table_info enhancement
+ /* Get relation qual */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
=>
I think a small improvement is possible in this SQL.
If we change that to "SELECT DISTINCT pg_get_expr(prqual, prrelid)"
then it prevents the copy SQL from containing multiple identical WHERE
clauses. Those could occur when subscribed to multiple publications
that have the same filter for the same table.
I attached a tmp POC patch for this change and it works as expected.
For example, I subscribe to 3 publications, but 2 of them have the
same filter for the table.
BEFORE
COPY (SELECT key, value, data FROM public.test WHERE (key > 0) AND
(key > 1000) AND (key > 1000)) TO STDOUT
AFTER
COPY (SELECT key, value, data FROM public.test WHERE (key > 0) AND
(key > 1000) ) TO STDOUT
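The effect of the DISTINCT on the generated COPY statement can be sketched like this; the dedup step and SQL assembly below are illustrative stand-ins, not the backend's actual code:

```python
# Sketch of building the initial-copy statement from the qual texts
# fetched for one table. Duplicate quals (the same filter seen via
# several subscribed publications) are dropped, mirroring what
# SELECT DISTINCT achieves on the publisher side. Table and column
# names are made up for the example.

def build_copy_sql(columns, table, quals):
    seen = []
    for q in quals:
        if q not in seen:        # keep first occurrence, preserve order
            seen.append(q)
    where = " AND ".join(f"({q})" for q in seen)
    sql = f"COPY (SELECT {', '.join(columns)} FROM {table}"
    if where:
        sql += f" WHERE {where}"
    return sql + ") TO STDOUT"

# Three publications, two of which share the same filter for the table.
quals = ["key > 0", "key > 1000", "key > 1000"]
sql = build_copy_sql(["key", "value", "data"], "public.test", quals)
# -> COPY (SELECT key, value, data FROM public.test
#          WHERE (key > 0) AND (key > 1000)) TO STDOUT
```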
------
8. src/backend/replication/pgoutput/pgoutput.c - qual member is redundant
@@ -99,6 +108,9 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *qual; /* row filter */
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
=>
Now that the exprstate is introduced I think that the other member
"qual" is redundant, so it can be removed.
FYI - I attached a tmp patch with all the qual references deleted and
everything is fine.
------
9. src/backend/replication/pgoutput/pgoutput.c - comment typo?
+ /*
+ * Cache ExprState using CacheMemoryContext. This is the same code as
+ * ExecPrepareExpr() but it is not used because it doesn't use an EState.
+ * It should probably be another function in the executor to handle the
+ * execution outside a normal Plan tree context.
+ */
=>
typo: it/that ?
I think it ought to say "This is the same code as ExecPrepareExpr()
but that is not used because"...
------
10. src/backend/replication/pgoutput/pgoutput.c - redundant debug logging?
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ elog(DEBUG3, "row filter %smatched", result ? "" : "not ");
The above debug logging is really just a repeat (in different wording)
of the information already logged inside pgoutput_row_filter_exec_expr,
isn't it? Consider removing the redundant logging.
e.g. This is already getting logged by pgoutput_row_filter_exec_expr:
elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
DatumGetBool(ret) ? "true" : "false",
isnull ? "true" : "false");
------
[1]: /messages/by-id/532a18d8-ce90-4444-8570-8a9fcf09f329@www.fastmail.com
[2]: /messages/by-id/849ee491-bba3-c0ae-cc25-4fce1c03f105@enterprisedb.com
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v18-0002-tmp-Gregs-hack-to-avoid-error.patch (application/octet-stream)
From 7464bf300902f8b042edf302e192809ea4deaf15 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Tue, 13 Jul 2021 10:50:35 +1000
Subject: [PATCH v18] tmp - Gregs hack to avoid error
---
src/backend/replication/pgoutput/pgoutput.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 08c018a..352dd4c 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -1256,8 +1256,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
}
/* create a tuple table slot for row filter */
- tupdesc = RelationGetDescr(relation);
+ //tupdesc = RelationGetDescr(relation);
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(RelationGetDescr(relation));
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
MemoryContextSwitchTo(oldctx);
--
1.8.3.1
v18-0003-tmp-the-RelationSynEntry-qual-is-redundant.patch (application/octet-stream)
From 8da5bcb281868f5b4c603bdc4a6f9d81fb32bfe0 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Tue, 13 Jul 2021 10:59:23 +1000
Subject: [PATCH v18] tmp - the RelationSynEntry "qual" is redundant.
Now that the exprstate member is introduced, the qual member is no longer needed.
All code using it can be removed.
---
src/backend/replication/pgoutput/pgoutput.c | 9 +--------
1 file changed, 1 insertion(+), 8 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 352dd4c..cdf1521 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -108,7 +108,6 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
- List *qual; /* row filter */
List *exprstate; /* ExprState for row filter */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
@@ -640,7 +639,7 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
bool result = true;
/* Bail out if there is no row filter */
- if (entry->qual == NIL)
+ if (entry->exprstate == NIL)
return true;
elog(DEBUG3, "table \"%s.%s\" has row filter",
@@ -1218,7 +1217,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
- entry->qual = NIL;
entry->scantuple = NULL;
entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
@@ -1353,7 +1351,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
rfnode = stringToNode(TextDatumGetCString(rfdatum));
- entry->qual = lappend(entry->qual, rfnode);
/* Prepare for expression execution */
exprstate = pgoutput_row_filter_init_expr(rfnode);
@@ -1510,10 +1507,6 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
- if (entry->qual != NIL)
- list_free_deep(entry->qual);
- entry->qual = NIL;
-
if (entry->exprstate != NIL)
list_free_deep(entry->exprstate);
entry->exprstate = NIL;
--
1.8.3.1
v18-0004-tmp-Initial-copy_data-to-use-only-DISTINCT-filte.patch (application/octet-stream)
From 28bc549b8895931d6fa5a0083bedc3fbdcc1cbbd Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Tue, 13 Jul 2021 12:48:29 +1000
Subject: [PATCH v18] tmp - Initial copy_data to use only DISTINCT filters.
If a subscriber subscribes to multiple publications, and those publications have filters for the same table, then when discovering all those filters we should only care about the DISTINCT ones.
For example, there is no point in applying the same identical WHERE clause multiple times in the COPY command.
---
src/backend/replication/logical/tablesync.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 980826a..efe3ce8 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -804,12 +804,15 @@ fetch_remote_table_info(char *nspname, char *relname,
walrcv_clear_result(res);
- /* Get relation qual */
+ /*
+ * Get relation qual. Use SELECT DISTINCT because there is no point to apply
+ * identical filters multiple times.
+ */
if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
{
resetStringInfo(&cmd);
appendStringInfo(&cmd,
- "SELECT pg_get_expr(prqual, prrelid) "
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
" FROM pg_publication p "
" INNER JOIN pg_publication_rel pr "
" ON (p.oid = pr.prpubid) "
@@ -830,6 +833,8 @@ fetch_remote_table_info(char *nspname, char *relname,
}
appendStringInfoChar(&cmd, ')');
+ elog(LOG, "!!> fetch_remote_table_info: SQL:\n%s", cmd.data);
+
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
if (res->status != WALRCV_OK_TUPLES)
@@ -932,6 +937,9 @@ copy_table(Relation rel)
appendStringInfoString(&cmd, ") TO STDOUT");
}
+
+ elog(LOG, "!!> copy_table SQL:\n%s", cmd.data);
+
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
if (res->status != WALRCV_OK_COPY_OUT)
--
1.8.3.1
v18-0001-Row-filter-for-logical-replication.patch (application/octet-stream)
From 9f10cac6c2e9aa0e8393d069c115944ea250c6cd Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Mon, 12 Jul 2021 19:13:45 +1000
Subject: [PATCH v18] Row filter for logical replication
This feature adds row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy a WHERE clause may be
optionally filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table, which allows
different row filters to be defined for different tables. A new row
filter can be added simply by specifying a WHERE clause after the table
name. The WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If the subscriber is a pre-15 version, data
synchronization won't use row filters even if they are defined in the
publisher; previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 32 ++-
doc/src/sgml/ref/create_subscription.sgml | 11 +-
src/backend/catalog/pg_publication.c | 42 +++-
src/backend/commands/publicationcmds.c | 114 +++++++----
src/backend/parser/gram.y | 24 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 13 ++
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 94 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 268 ++++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 71 +++++++
src/test/regress/sql/publication.sql | 32 +++
src/test/subscription/t/021_row_filter.pl | 298 ++++++++++++++++++++++++++++
24 files changed, 1025 insertions(+), 81 deletions(-)
create mode 100644 src/test/subscription/t/021_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index f517a7d..dbf2f46 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6233,6 +6233,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b..ca091aa 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is executed with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbc..5c2b7d0 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -131,9 +135,9 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
on its partitions) contained in the publication will be published
using the identity and schema of the partitioned table rather than
that of the individual partitions that are actually changed; the
- latter is the default. Enabling this allows the changes to be
- replicated into a non-partitioned table or a partitioned table
- consisting of a different set of partitions.
+ latter is the default (<literal>false</literal>). Enabling this
+ allows the changes to be replicated into a non-partitioned table or a
+ partitioned table consisting of a different set of partitions.
</para>
<para>
@@ -183,6 +187,14 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should probably contain only columns
+ that are part of the primary key or be covered by <literal>REPLICA
+ IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
+ be replicated. For <command>INSERT</command> and <command>UPDATE</command>
+ operations, any column can be used in the <literal>WHERE</literal> clause.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -197,6 +209,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -210,6 +227,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index e812bee..7183700 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,16 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If any table in the publications has a
+ <literal>WHERE</literal> clause, data synchronization does not use it
+ if the subscriber is a <productname>PostgreSQL</productname> version
+ before 15.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 86e415a..a15f00f 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,21 +144,27 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -177,6 +186,23 @@ publication_add_relation(Oid pubid, Relation targetrel,
check_publication_add_relation(targetrel);
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+
/* Form a tuple. */
memset(values, 0, sizeof(values));
memset(nulls, false, sizeof(nulls));
@@ -189,6 +215,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -205,6 +237,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 95c253c..12adcf1 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -48,7 +48,7 @@
/* Same as MAXNUMMESSAGES in sinvaladt.c */
#define MAX_RELCACHE_INVAL_MSGS 4096
-static List *OpenTableList(List *tables);
+static List *OpenTableList(List *tables, bool pub_drop_table);
static void CloseTableList(List *rels);
static void PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
AlterPublicationStmt *stmt);
@@ -232,7 +232,7 @@ CreatePublication(CreatePublicationStmt *stmt)
Assert(list_length(stmt->tables) > 0);
- rels = OpenTableList(stmt->tables);
+ rels = OpenTableList(stmt->tables, false);
PublicationAddTables(puboid, rels, true, NULL);
CloseTableList(rels);
}
@@ -361,6 +361,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *rels = NIL;
Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
Oid pubid = pubform->oid;
+ bool isdrop = (stmt->tableAction == DEFELEM_DROP);
/* Check that user is allowed to manipulate the publication tables. */
if (pubform->puballtables)
@@ -372,7 +373,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
Assert(list_length(stmt->tables) > 0);
- rels = OpenTableList(stmt->tables);
+ rels = OpenTableList(stmt->tables, isdrop);
if (stmt->tableAction == DEFELEM_ADD)
PublicationAddTables(pubid, rels, false, stmt);
@@ -385,31 +386,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- Relation newrel = (Relation) lfirst(newlc);
-
- if (RelationGetRelid(newrel) == oldrelid)
- {
- found = true;
- break;
- }
- }
-
- if (!found)
- {
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
-
- delrels = lappend(delrels, oldrel);
- }
+ PublicationRelationInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelationInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -500,26 +494,44 @@ RemovePublicationRelById(Oid proid)
/*
* Open relations specified by a RangeVar list.
+ *
+ * Publication node can have a different list element, hence, pub_drop_table
+ * indicates if it has a Relation (true) or PublicationTable (false).
+ *
* The returned tables are locked in ShareUpdateExclusiveLock mode in order to
* add them to a publication.
*/
static List *
-OpenTableList(List *tables)
+OpenTableList(List *tables, bool pub_drop_table)
{
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationInfo *pri;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
- bool recurse = rv->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
+ if (pub_drop_table)
+ {
+ rv = castNode(RangeVar, lfirst(lc));
+ }
+ else
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+
+ recurse = rv->inh;
+
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
@@ -538,8 +550,14 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = myrelid;
+ pri->relation = rel;
+ if (pub_drop_table)
+ pri->whereClause = NULL;
+ else
+ pri->whereClause = t->whereClause;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, myrelid);
/*
@@ -572,7 +590,15 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = childrelid;
+ pri->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (pub_drop_table)
+ pri->whereClause = NULL;
+ else
+ pri->whereClause = t->whereClause;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, childrelid);
}
}
@@ -585,6 +611,9 @@ OpenTableList(List *tables)
/*
* Close all relations in the list.
+ *
+ * Publication node can have a different list element, hence, pub_drop_table
+ * indicates if it has a Relation (true) or PublicationTable (false).
*/
static void
CloseTableList(List *rels)
@@ -593,10 +622,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(pri->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -612,15 +643,15 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(pri->relid, GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(pri->relation->rd_rel->relkind),
+ RelationGetRelationName(pri->relation));
- obj = publication_add_relation(pubid, rel, if_not_exists);
+ obj = publication_add_relation(pubid, pri, if_not_exists);
if (stmt)
{
EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
@@ -644,11 +675,10 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pri->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -658,7 +688,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pri->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index eb24195..d82ea00 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -426,14 +426,14 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
vacuum_relation_list opt_vacuum_relation_list
- drop_option_list
+ drop_option_list publication_table_list
%type <node> opt_routine_body
%type <groupclause> group_clause
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -9612,7 +9612,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9643,7 +9643,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9651,7 +9651,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9669,6 +9669,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 24268eb..8fb953b 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -950,6 +957,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f928c32..fc4170e 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -509,6 +516,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1769,6 +1777,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3089,6 +3100,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..e946f17 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Disallow user-defined operators in publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 682c107..980826a 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -691,19 +691,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -799,6 +803,55 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /* Get relation qual */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -812,6 +865,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -820,7 +874,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -829,16 +883,23 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);
+ /* List of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -847,8 +908,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index abd5217..08c018a 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -99,6 +108,9 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *qual; /* row filter */
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -122,7 +134,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -131,6 +143,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -521,6 +540,154 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache the ExprState in CacheMemoryContext. This mirrors
+ * ExecPrepareExpr(), which cannot be used here because it requires an
+ * EState. This should probably become a separate executor function for
+ * executing expressions outside a normal plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the change
+ * isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change should be replicated, otherwise false.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->qual == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ if (entry->scantuple == NULL)
+ elog(DEBUG1, "entry->scantuple is null");
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications that publish the same
+ * table with different row filters, all row filters must match for the
+ * change to be replicated.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ elog(DEBUG3, "row filter %smatched", result ? "" : "not ");
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -547,7 +714,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -571,8 +738,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, txn, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -580,6 +745,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * The schema must be sent before the relation is switched to its
+ * root parent, because maybe_send_schema() also sends the
+ * ancestor's schema.
+ */
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -603,6 +778,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -631,6 +812,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -694,7 +881,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1005,9 +1192,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1030,6 +1218,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->qual = NIL;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1041,6 +1232,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc;
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1054,6 +1246,23 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+
+ elog(DEBUG1, "get_rel_sync_entry: free entry->scantuple");
+ }
+
+ /* create a tuple table slot for row filter */
+ tupdesc = RelationGetDescr(relation);
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
+ elog(DEBUG1, "get_rel_sync_entry: allocate entry->scantuple");
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1063,6 +1272,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1122,9 +1334,34 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ entry->qual = lappend(entry->qual, rfnode);
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1242,6 +1479,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1251,6 +1489,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1268,5 +1508,15 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->qual != NIL)
+ list_free_deep(entry->qual);
+ entry->qual = NIL;
+
+ if (entry->exprstate != NIL)
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
}
+
+ MemoryContextSwitchTo(oldctx);
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 3211521..6f944ec 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4172,6 +4172,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4182,9 +4183,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4193,6 +4201,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4233,6 +4242,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4265,8 +4278,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index ba9bc6d..7d72d49 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -626,6 +626,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 2abf255..e2e64cb 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBuffer(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index f332bad..2703b9c 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -83,6 +83,13 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationInfo
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationInfo;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -108,7 +115,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index d9e417b..2037705 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -491,6 +491,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index def9651..cf815cc 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3624,12 +3624,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3642,7 +3649,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 1500de2..4537543 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 63d6ab7..444f834 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -156,6 +156,77 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: "testpub_view" is not a table
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075..b1606cc 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,38 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/021_row_filter.pl b/src/test/subscription/t/021_row_filter.pl
new file mode 100644
index 0000000..0f6d2f0
--- /dev/null
+++ b/src/test/subscription/t/021_row_filter.pl
@@ -0,0 +1,298 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions must be satisfied. In this case, rows are replicated only if
+# the c value is divisible by both 2 AND 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
On Mon, Jul 12, 2021 at 3:01 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 7/12/21 6:46 AM, Amit Kapila wrote:
On Mon, Jul 12, 2021 at 7:19 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Now, the other idea I had in mind was to traverse the WHERE clause
expression in publication_add_relation and identify if it contains
anything other than the ANDed list of 'foo.bar op constant'
expressions. OTOH, for index where clause expressions or policy check
expressions, we use a technique similar to what we have in the patch
to prohibit certain kinds of expressions.Do you have any preference on how this should be addressed?
I don't think this is sufficient, because who knows where "op" comes
from? It might be from an extension, in which case the problem pointed
out by Petr Jelinek [1] would apply. OTOH I suppose we could allow
expressions like (Var op Var), i.e. "a < b" or something like that. And
then why not allow (a+b < c-10) and similar "more complex" expressions,
as long as all the operators are built-in?
Yeah, and the patch already disallows the user-defined operators in
filters. I think, ideally, if the operator doesn't refer to UDFs, we can
allow such an operator to be used directly in the filter, as we can add a
dependency for it.
In terms of implementation, I think there are two basic options - either
we can define a new "expression" type in gram.y, which would be a subset
of a_expr etc. Or we can do it as some sort of expression walker, kinda
like what the transform* functions do now.
I think it is better to use some form of walker here rather than
extending the grammar for this. However, the question is do we need
some special kind of expression walker here or can we handle all
required cases via transformWhereClause() call as the patch is trying
to do. AFAIU, the main things we want to prohibit in the filter are:
(a) it doesn't refer to any relation other than catalog in where
clause, (b) it doesn't use UDFs in any way (in expressions, in
user-defined operators, user-defined types, etc.), (c) the columns
referred to in the filter should be part of PK or Replica Identity.
Now, if all such things can be detected by the approach patch has
taken then why do we need a special kind of expression walker? OTOH,
if we can't detect some of this then probably we can use a special
walker.
I think in the long run one idea to allow UDFs is probably by
explicitly allowing users to specify whether the function is
publication predicate safe and if so, then we can allow such functions
in the filter clause.
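The validation being discussed — reject UDFs and user-defined operators while allowing built-in ones — is essentially a recursive walk over the expression tree. Here is a minimal Python sketch of that idea; the node classes and the FIRST_NORMAL_OID cutoff are illustrative stand-ins for the backend's Node types and OID threshold, not PostgreSQL's actual C API:

```python
# Illustrative sketch of the "expression walker" approach discussed above.
# This is NOT PostgreSQL code: the node classes and FIRST_NORMAL_OID are
# assumptions standing in for the backend's Node tree and OID boundary.

FIRST_NORMAL_OID = 16384  # OIDs at or above this belong to user-defined objects

class Var:
    def __init__(self, attname):
        self.attname = attname

class Const:
    def __init__(self, value):
        self.value = value

class OpExpr:
    def __init__(self, opoid, args):
        self.opoid = opoid
        self.args = args

class FuncExpr:
    def __init__(self, funcoid, args):
        self.funcoid = funcoid
        self.args = args

def filter_is_allowed(node):
    """Return False if the row-filter expression uses any user-defined
    function or operator, mimicking a recursive expression walker."""
    if isinstance(node, (Var, Const)):
        return True
    if isinstance(node, OpExpr):
        if node.opoid >= FIRST_NORMAL_OID:
            return False            # user-defined operator: reject
        return all(filter_is_allowed(a) for a in node.args)
    if isinstance(node, FuncExpr):
        return False                # functions disallowed outright
    return False                    # unknown node type: be conservative

# (e > 1000 AND e < 2000) modelled as nested built-in operators
builtin_ops = OpExpr(58, [OpExpr(521, [Var("e"), Const(1000)]),
                          OpExpr(97, [Var("e"), Const(2000)])])
# e =#> 27 modelled with a user-defined operator OID
udf_op = OpExpr(16405, [Var("e"), Const(27)])
```

The same traversal could later admit functions explicitly marked "publication predicate safe", as suggested above, by checking a flag instead of rejecting FuncExpr unconditionally.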
--
With Regards,
Amit Kapila.
Hi Euler,
Greg noticed that your patch set was missing any implementation of the
psql tab auto-complete for the new row filter WHERE syntax.
So I have added a POC patch for this missing feature.
Unfortunately, there is an existing HEAD problem overlapping with this
exact same code. I reported this already in another thread [1].
So there are 2 patches attached here:
0001 - Fixes the other reported problem (I hope this may be pushed soon)
0002 - Adds the tab-completion code for your row filter WHERE's
------
[1]: /messages/by-id/CAHut+Ps-vkmnWAShWSRVCB3gx8aM=bFoDqWgBNTzofK0q1LpwA@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Jul 13, 2021 at 1:25 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Mon, Jul 12, 2021 at 5:39 AM Euler Taveira <euler@eulerto.com> wrote:
On Fri, Jul 2, 2021, at 4:29 AM, Peter Smith wrote:
Hi.
I have been looking at the latest patch set (v16). Below are my review
comments and some patches.

Peter, thanks for your detailed review. Comments are inline.
Hi Euler,
Thanks for addressing my previous review comments.
I have reviewed the latest v18 patch. Below are some more review
comments and patches.

(The patches 0003,0004 are just examples of what is mentioned in my
comments; the patches 0001,0002 are there only to try to keep cfbot
green.)

//////////
1. Commit comment - wording
"When a publication is defined or modified, rows that don't satisfy a
WHERE clause may be
optionally filtered out."

=>
I think this means to say: "Rows that don't satisfy an optional WHERE
clause will be filtered out."

------
2. Commit comment - wording
"The row filter is per table, which allows different row filters to be
defined for different tables."

=>
I think all that is the same as just saying: "The row filter is per table."
------
3. PG docs - independent improvement
You wrote (ref [1] point 3):
"I agree it can be confusing. BTW, CREATE PUBLICATION does not mention that the
root partitioned table is used. We should improve that sentence too."

I agree, but that PG docs improvement is independent of your RowFilter
patch; please make another thread for that idea.

------
4. doc/src/sgml/ref/create_publication.sgml - independent improvement
@@ -131,9 +135,9 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
          on its partitions) contained in the publication will be published
          using the identity and schema of the partitioned table rather than
          that of the individual partitions that are actually changed; the
-         latter is the default.  Enabling this allows the changes to be
-         replicated into a non-partitioned table or a partitioned table
-         consisting of a different set of partitions.
+         latter is the default (<literal>false</literal>).  Enabling this
+         allows the changes to be replicated into a non-partitioned table or a
+         partitioned table consisting of a different set of partitions.
         </para>

I think that Tomas wrote (ref [2] point 2) that this change seems
unrelated to your RowFilter patch.

I agree; I liked the change, but IMO you need to propose this one in
another thread too.

------
5. doc/src/sgml/ref/create_subscription.sgml - wording
@@ -102,7 +102,16 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
          <para>
           Specifies whether the existing data in the publications that are
           being subscribed to should be copied once the replication starts.
-          The default is <literal>true</literal>.
+          The default is <literal>true</literal>. If any table in the
+          publications has a <literal>WHERE</literal> clause, rows that do not
+          satisfy the <replaceable class="parameter">expression</replaceable>
+          will not be copied. If the subscription has several publications in
+          which a table has been published with different
+          <literal>WHERE</literal> clauses, rows must satisfy all expressions
+          to be copied. If any table in the publications has a
+          <literal>WHERE</literal> clause, data synchronization does not use it
+          if the subscriber is a <productname>PostgreSQL</productname> version
+          before 15.

I felt that the sentence: "If any table in the publications has a
<literal>WHERE</literal> clause, data synchronization does not use it
if the subscriber is a <productname>PostgreSQL</productname> version
before 15."

Could be expressed more simply like: "If the subscriber is a
<productname>PostgreSQL</productname> version before 15 then any row
filtering is ignored."

------
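The copy_data semantics described in that doc change — a table published with different WHERE clauses in several subscribed publications only copies rows that satisfy all of them — can be sketched as follows (the table and filters are hypothetical, mirroring the tab_rowfilter_2 case in the TAP test):

```python
# Sketch of the documented rule: when a table is published with different
# WHERE clauses in several subscribed publications, a row is copied only
# if it satisfies ALL of them. The filters below are hypothetical.

def row_is_copied(row, filters):
    """filters: one predicate per publication the table appears in."""
    return all(f(row) for f in filters)

# tab_rowfilter_2 in tap_pub_1 (c % 2 = 0) and tap_pub_2 (c % 3 = 0)
filters = [lambda row: row["c"] % 2 == 0,
           lambda row: row["c"] % 3 == 0]

copied = [c for c in range(1, 21) if row_is_copied({"c": c}, filters)]
# only values divisible by both 2 and 3 survive: 6, 12, 18
```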
6. src/backend/commands/publicationcmds.c - wrong function comment
@@ -585,6 +611,9 @@ OpenTableList(List *tables)
 /*
  * Close all relations in the list.
+ *
+ * Publication node can have a different list element, hence, pub_drop_table
+ * indicates if it has a Relation (true) or PublicationTable (false).
  */
 static void
 CloseTableList(List *rels)

=>
The 2nd parameter does not exist in v18, so that comment about
pub_drop_table seems to be a cut/paste error from the OpenTableList.

------
src/backend/replication/logical/tablesync.c - bug ?
@@ -829,16 +883,23 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
 	Assert(rel == relmapentry->localrel);

+	/* List of columns for COPY */
+	attnamelist = make_copy_attnamelist(relmapentry);
+
 	/* Start copy on the publisher. */

=>
I did not understand the above call to make_copy_attnamelist. The
result seems unused before it is overwritten later in this same
function (??)

------
7. src/backend/replication/logical/tablesync.c -
fetch_remote_table_info enhancement

+	/* Get relation qual */
+	if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+	{
+		resetStringInfo(&cmd);
+		appendStringInfo(&cmd,
+						 "SELECT pg_get_expr(prqual, prrelid) "
+						 "  FROM pg_publication p "
+						 "  INNER JOIN pg_publication_rel pr "
+						 "       ON (p.oid = pr.prpubid) "
+						 " WHERE pr.prrelid = %u "
+						 "   AND p.pubname IN (", lrel->remoteid);

=>
I think a small improvement is possible in this SQL.
If we change that to "SELECT DISTINCT pg_get_expr(prqual, prrelid)"...
then it avoids the copy SQL from having multiple WHERE clauses which
are all identical. This could happen when subscribed to multiple
publications which had the same filter for the same table.

I attached a tmp POC patch for this change and it works as expected.
For example, I subscribe to 3 publications, but 2 of them have the
same filter for the table.

BEFORE
COPY (SELECT key, value, data FROM public.test WHERE (key > 0) AND
(key > 1000) AND (key > 1000)) TO STDOUT

AFTER
COPY (SELECT key, value, data FROM public.test WHERE (key > 0) AND
(key > 1000) ) TO STDOUT

------
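What the DISTINCT suggestion accomplishes on the worker side is simply that identical quals fetched from multiple publications collapse to one before the COPY statement is assembled. A sketch of that effect — the helper name and clause strings are illustrative, not the patch's actual code:

```python
# Sketch of the effect of adding DISTINCT to the remote qual query:
# identical WHERE clauses fetched from multiple publications collapse
# to one before the COPY command is built. Names are illustrative.

def build_copy_sql(columns, table, quals):
    # Deduplicate while preserving first-seen order (DISTINCT itself does
    # not guarantee order; this is just a sketch of the deduplication).
    seen = []
    for q in quals:
        if q not in seen:
            seen.append(q)
    where = " AND ".join(f"({q})" for q in seen)
    return (f"COPY (SELECT {', '.join(columns)} FROM {table}"
            + (f" WHERE {where}" if where else "") + ") TO STDOUT")

# Three subscribed publications, two of which carry the same filter
sql = build_copy_sql(["key", "value", "data"], "public.test",
                     ["key > 0", "key > 1000", "key > 1000"])
```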
8. src/backend/replication/pgoutput/pgoutput.c - qual member is redundant
@@ -99,6 +108,9 @@ typedef struct RelationSyncEntry
 	bool		replicate_valid;
 	PublicationActions pubactions;

+	List	   *qual;			/* row filter */
+	List	   *exprstate;		/* ExprState for row filter */
+	TupleTableSlot *scantuple;	/* tuple table slot for row filter */

=>
Now that the exprstate is introduced I think that the other member
"qual" is redundant, so it can be removed.

FYI - I attached a tmp patch with all the qual references deleted and
everything is fine.

------
9. src/backend/replication/pgoutput/pgoutput.c - comment typo?
+	/*
+	 * Cache ExprState using CacheMemoryContext. This is the same code as
+	 * ExecPrepareExpr() but it is not used because it doesn't use an EState.
+	 * It should probably be another function in the executor to handle the
+	 * execution outside a normal Plan tree context.
+	 */

=>
typo: it/that ?
I think it ought to say "This is the same code as ExecPrepareExpr()
but that is not used because"...

------
10. src/backend/replication/pgoutput/pgoutput.c - redundant debug logging?
+	/* Evaluates row filter */
+	result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+	elog(DEBUG3, "row filter %smatched", result ? "" : "not ");

The above debug logging is really only a repeat (with different
wording) of the same information already being logged inside the
pgoutput_row_filter_exec_expr function isn't it? Consider removing the
redundant logging.

e.g. This is already getting logged by pgoutput_row_filter_exec_expr:
elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
DatumGetBool(ret) ? "true" : "false",
isnull ? "true" : "false");

------
[1] /messages/by-id/532a18d8-ce90-4444-8570-8a9fcf09f329@www.fastmail.com
[2] /messages/by-id/849ee491-bba3-c0ae-cc25-4fce1c03f105@enterprisedb.com
[3] /messages/by-id/532a18d8-ce90-4444-8570-8a9fcf09f329@www.fastmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v18-0001-fix-tab-auto-complete-CREATE-PUBLICATION.patchapplication/octet-stream; name=v18-0001-fix-tab-auto-complete-CREATE-PUBLICATION.patchDownload
From f782b494ccf3915bee45c1d1e04a2557b4da6a7a Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Tue, 13 Jul 2021 15:38:36 +1000
Subject: [PATCH v18 1/2] fix - tab auto-complete CREATE PUBLICATION
Fixes some missing tab auto-completes.
Discussion: https://www.postgresql.org/message-id/CAHut+Ps-vkmnWAShWSRVCB3gx8aM=bFoDqWgBNTzofK0q1LpwA@mail.gmail.com
---
src/bin/psql/tab-complete.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 0ebd5aa..86a8120 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2637,8 +2637,13 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("FOR TABLE", "FOR ALL TABLES", "WITH (");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR"))
COMPLETE_WITH("TABLE", "ALL TABLES");
- /* Complete "CREATE PUBLICATION <name> FOR TABLE <table>, ..." */
- else if (HeadMatches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL"))
+ COMPLETE_WITH("TABLES");
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES")
+ || Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
+ COMPLETE_WITH("WITH (");
+ /* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
/* Complete "CREATE PUBLICATION <name> [...] WITH" */
else if (HeadMatches("CREATE", "PUBLICATION") && TailMatches("WITH", "("))
--
1.8.3.1
v18-0002-tmp-Add-tab-auto-complete-support-for-the-Row-Fi.patchapplication/octet-stream; name=v18-0002-tmp-Add-tab-auto-complete-support-for-the-Row-Fi.patchDownload
From 94d8891bfc149e467bf2af24b15b683a86a09c92 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Tue, 13 Jul 2021 16:34:57 +1000
Subject: [PATCH v18 2/2] tmp - Add tab auto-complete support for the Row
Filter WHERE
Following auto-completes are added:
Complete "CREATE PUBLICATION <name> FOR TABLE <name>" with "WHERE (".
Complete "ALTER PUBLICATION <name> ADD TABLE <name>" with "WHERE (".
Complete "ALTER PUBLICATION <name> SET TABLE <name>" with "WHERE (".
This patch is on top of the one described here: https://www.postgresql.org/message-id/CAHut+Ps-vkmnWAShWSRVCB3gx8aM=bFoDqWgBNTzofK0q1LpwA@mail.gmail.com
---
src/bin/psql/tab-complete.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 86a8120..9aeae1d 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1644,6 +1644,11 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET", "TABLE", MatchAny)
+ || Matches("ALTER", "PUBLICATION", MatchAny, "ADD", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (");
/* ALTER PUBLICATION <name> SET ( */
else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("publish", "publish_via_partition_root");
@@ -2639,9 +2644,10 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("TABLE", "ALL TABLES");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL"))
COMPLETE_WITH("TABLES");
- else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES")
- || Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("WITH (");
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
--
1.8.3.1
On Tue, Jul 13, 2021 at 10:24 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jul 12, 2021 at 3:01 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

In terms of implementation, I think there are two basic options - either
we can define a new "expression" type in gram.y, which would be a subset
of a_expr etc. Or we can do it as some sort of expression walker, kinda
like what the transform* functions do now.

I think it is better to use some form of walker here rather than
extending the grammar for this. However, the question is do we need
some special kind of expression walker here or can we handle all
required cases via transformWhereClause() call as the patch is trying
to do. AFAIU, the main things we want to prohibit in the filter are:
(a) it doesn't refer to any relation other than catalog in where
clause, (b) it doesn't use UDFs in any way (in expressions, in
user-defined operators, user-defined types, etc.), (c) the columns
referred to in the filter should be part of PK or Replica Identity.
Now, if all such things can be detected by the approach patch has
taken then why do we need a special kind of expression walker? OTOH,
if we can't detect some of this then probably we can use a special
walker.

I think in the long run one idea to allow UDFs is probably by
explicitly allowing users to specify whether the function is
publication predicate safe and if so, then we can allow such functions
in the filter clause.
Another idea here could be to read the publication-related catalog
with the latest snapshot instead of a historic snapshot. If we do that
then if the user faces problems as described by Petr [1] due to
missing dependencies via UDFs then she can Alter the Publication to
remove/change the filter clause and after that, we would be able to
recognize the updated filter clause and the system will be able to
move forward.
I might be missing something, but reading publication catalogs with
non-historic snapshots shouldn't create problems, as historic snapshots
are required only to decode WAL.
I think the problem described by Petr [1] is also possible today if the
user drops the publication and there is a corresponding subscription,
basically, the system will be stuck with the error: ERROR: publication
"mypub" does not exist. I think allowing the use of non-historic snapshots
just for publications will resolve that problem as well.
[1]: /messages/by-id/92e5587d-28b8-5849-2374-5ca3863256f1@2ndquadrant.com
--
With Regards,
Amit Kapila.
On Tue, 2021-07-13 at 10:24 +0530, Amit Kapila wrote:
to do. AFAIU, the main things we want to prohibit in the filter are:
(a) it doesn't refer to any relation other than catalog in where
clause,
Right, because the walsender is using a historical snapshot.
(b) it doesn't use UDFs in any way (in expressions, in
user-defined operators, user-defined types, etc.),
Is this a reasonable requirement? Postgres has a long history of
allowing UDFs nearly everywhere that a built-in is allowed. It feels
wrong to make built-ins special for this feature.
(c) the columns
referred to in the filter should be part of PK or Replica Identity.
Why?
Also:
* Andres also mentioned that the function should not leak memory.
* One use case for this feature is when sharding a table, so the
expression should allow things like "hashint8(x) between ...". I'd
really like to see this problem solved, as well.
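For instance, a sharded setup might look something like this (a
hypothetical sketch; the table and publication names are invented, and
whether hashint8 would pass the proposed filter restrictions is exactly
the open question):

```sql
-- Hypothetical: split one table across two subscribers by hash range.
-- hashint8() is a built-in function returning a 32-bit hash.
CREATE PUBLICATION shard_lo FOR TABLE orders
    WHERE (hashint8(order_id) BETWEEN -2147483648 AND -1);
CREATE PUBLICATION shard_hi FOR TABLE orders
    WHERE (hashint8(order_id) BETWEEN 0 AND 2147483647);
```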
I think in the long run one idea to allow UDFs is probably by
explicitly allowing users to specify whether the function is
publication predicate safe and if so, then we can allow such
functions
in the filter clause.
This sounds like a better direction. We probably need some kind of
catalog information here to say what functions/operators are "safe" for
this kind of purpose. There are a couple questions:
1. Should this notion of safety be specific to this feature, or should
we try to generalize it so that other areas of the system might benefit
as well?
2. Should this marking be superuser-only, or user-specified?
3. Should it be related to the IMMUTABLE/STABLE/VOLATILE designation,
or completely separate?
Regards,
Jeff Davis
On 7/13/21 5:44 PM, Jeff Davis wrote:
On Tue, 2021-07-13 at 10:24 +0530, Amit Kapila wrote:
to do. AFAIU, the main things we want to prohibit in the filter are:
(a) it doesn't refer to any relation other than catalog in where
clause,

Right, because the walsender is using a historical snapshot.
(b) it doesn't use UDFs in any way (in expressions, in
user-defined operators, user-defined types, etc.),

Is this a reasonable requirement? Postgres has a long history of
allowing UDFs nearly everywhere that a built-in is allowed. It feels
wrong to make built-ins special for this feature.
Well, we can either prohibit UDFs or introduce a massive foot-gun.
The problem with functions in general (let's ignore SQL functions) is
that they're black boxes, so we don't know what's inside. And if the
function gets broken after an object gets dropped, the replication is
broken and the only way to fix it is to recover the subscription.
And this is not a hypothetical issue; we've seen this repeatedly :-(
So as much as I'd like to see support for UDFs here, I think it's better
to disallow them - at least for now. And maybe relax that restriction
later, if possible.
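A hypothetical sketch of that failure mode (all names invented): the
recorded dependencies cover the UDF itself, but not objects the UDF's
body references, so a later DROP can silently break decoding:

```sql
-- The filter depends on is_active(), and that dependency is recorded.
CREATE FUNCTION is_active(s text) RETURNS boolean LANGUAGE sql
    AS $$ SELECT EXISTS (SELECT 1 FROM active_codes WHERE code = s) $$;
CREATE PUBLICATION pub_active FOR TABLE customers WHERE (is_active(status));

-- But nothing records that the filter (indirectly) needs active_codes:
DROP TABLE active_codes;
-- From here on the walsender fails every time it evaluates the filter,
-- and replication cannot progress until the subscription is recovered.
```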
(c) the columns
referred to in the filter should be part of PK or Replica Identity.

Why?
I'm not sure either.
Also:
* Andres also mentioned that the function should not leak memory.
* One use case for this feature is when sharding a table, so the
expression should allow things like "hashint8(x) between ...". I'd
really like to see this problem solved, as well.
I think built-in functions should be fine, because they generally don't get
dropped etc. (And if you drop a built-in function, well - sorry.)
Not sure about the memory leaks - I suppose we'd free memory for each
row, so this shouldn't be an issue I guess ...
I think in the long run one idea to allow UDFs is probably by
explicitly allowing users to specify whether the function is
publication predicate safe and if so, then we can allow such
functions
in the filter clause.

This sounds like a better direction. We probably need some kind of
catalog information here to say what functions/operators are "safe" for
this kind of purpose. There are a couple questions:
Not sure. It's true it's a bit like volatile/stable/immutable categories
where we can't guarantee those labels are correct, and it's up to the
user to keep the pieces if they pick the wrong category.
But we can achieve the same goal by introducing a simple GUC called
dangerous_allow_udf_in_decoding, I think.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 7/13/21 12:57 PM, Amit Kapila wrote:
On Tue, Jul 13, 2021 at 10:24 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jul 12, 2021 at 3:01 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

In terms of implementation, I think there are two basic options - either
we can define a new "expression" type in gram.y, which would be a subset
of a_expr etc. Or we can do it as some sort of expression walker, kinda
like what the transform* functions do now.

I think it is better to use some form of walker here rather than
extending the grammar for this. However, the question is do we need
some special kind of expression walker here or can we handle all
required cases via transformWhereClause() call as the patch is trying
to do. AFAIU, the main things we want to prohibit in the filter are:
(a) it doesn't refer to any relation other than catalog in where
clause, (b) it doesn't use UDFs in any way (in expressions, in
user-defined operators, user-defined types, etc.), (c) the columns
referred to in the filter should be part of PK or Replica Identity.
Now, if all such things can be detected by the approach patch has
taken then why do we need a special kind of expression walker? OTOH,
if we can't detect some of this then probably we can use a special
walker.

I think in the long run one idea to allow UDFs is probably by
explicitly allowing users to specify whether the function is
publication predicate safe and if so, then we can allow such functions
in the filter clause.

Another idea here could be to read the publication-related catalog
with the latest snapshot instead of a historic snapshot. If we do that
then if the user faces problems as described by Petr [1] due to
missing dependencies via UDFs then she can Alter the Publication to
remove/change the filter clause and after that, we would be able to
recognize the updated filter clause and the system will be able to
move forward.

I might be missing something, but reading publication catalogs with
non-historic snapshots shouldn't create problems, as historic snapshots
are required only to decode WAL.
IMHO the best option for v1 is to just restrict the filters to
known-safe expressions. That is, just built-in operators, no UDFs etc.
Yes, it's not great, but both alternative proposals (allowing UDFs or
using current snapshot) are problematic for various reasons.
Even with those restrictions the row filtering seems quite useful, and
we can relax those restrictions later if we find acceptable compromise
and/or decide it's worth the risk. Seems better than having to introduce
new restrictions later.
I think the problem described by Petr[1] is also possible today if the
user drops the publication and there is a corresponding subscription,
basically, the system will be stuck with the error: ERROR: publication
"mypub" does not exist. I think allowing the use of non-historic snapshots
just for publications will resolve that problem as well.

[1] - /messages/by-id/92e5587d-28b8-5849-2374-5ca3863256f1@2ndquadrant.com
That seems like a completely different problem, TBH. For example the
slot is dropped too, which means the WAL is likely gone etc.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2021-Jul-13, Tomas Vondra wrote:
On 7/13/21 5:44 PM, Jeff Davis wrote:
* Andres also mentioned that the function should not leak memory.
* One use case for this feature is when sharding a table, so the
expression should allow things like "hashint8(x) between ...". I'd
really like to see this problem solved, as well.

I think built-in functions should be fine, because they generally don't get
dropped etc. (And if you drop a built-in function, well - sorry.)

Not sure about the memory leaks - I suppose we'd free memory for each row,
so this shouldn't be an issue I guess ...
I'm not sure we need to be terribly strict about expression evaluation
not leaking any memory here. I'd rather have a memory context that can
be reset per row.
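As a sketch (assuming the usual executor memory-context conventions;
apart from the existing ExprContext machinery, the function name here is
illustrative, not from the patch):

```c
/* Sketch: evaluate the row filter in the per-tuple memory context and
 * reset it after each row, so any leak inside expression evaluation is
 * bounded to a single row. */
static bool
row_filter_matches(ExprState *exprstate, ExprContext *ecxt)
{
    MemoryContext oldctx;
    bool        result;

    oldctx = MemoryContextSwitchTo(ecxt->ecxt_per_tuple_memory);
    result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
    MemoryContextSwitchTo(oldctx);

    /* Release everything allocated while evaluating this row. */
    ResetExprContext(ecxt);

    return result;
}
```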
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
On Tue, Jul 13, 2021, at 12:25 AM, Peter Smith wrote:
I have reviewed the latest v18 patch. Below are some more review
comments and patches.
Peter, thanks for quickly checking the new patch. I'm attaching a new patch (v19)
that addresses (a) this new review, (b) Tomas' review and (c) Greg's review. I
also included the copy/equal node support for the new node (PublicationTable)
mentioned by Tomas in another email.
1. Commit comment - wording
8<
=>
I think this means to say: "Rows that don't satisfy an optional WHERE
clause will be filtered out."
Agreed.
2. Commit comment - wording
"The row filter is per table, which allows different row filters to be
defined for different tables."

=>
I think all that is the same as just saying: "The row filter is per table."
Agreed.
3. PG docs - independent improvement
You wrote (ref [1] point 3):
"I agree it can be confusing. BTW, CREATE PUBLICATION does not mention that the
root partitioned table is used. We should improve that sentence too."

I agree, but that PG docs improvement is independent of your RowFilter
patch; please make another thread for that idea.
I will. And I will also include the next item that I removed from the patch.
4. doc/src/sgml/ref/create_publication.sgml - independent improvement
@@ -131,9 +135,9 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
       on its partitions) contained in the publication will be published
       using the identity and schema of the partitioned table rather than
       that of the individual partitions that are actually changed; the
-      latter is the default. Enabling this allows the changes to be
-      replicated into a non-partitioned table or a partitioned table
-      consisting of a different set of partitions.
+      latter is the default (<literal>false</literal>). Enabling this
+      allows the changes to be replicated into a non-partitioned table or a
+      partitioned table consisting of a different set of partitions.
      </para>

I think that Tomas wrote (ref [2] point 2) that this change seems
unrelated to your RowFilter patch.

I agree; I liked the change, but IMO you need to propose this one in
another thread too.
Reverted.
5. doc/src/sgml/ref/create_subscription.sgml - wording
8<
I felt that the sentence: "If any table in the publications has a
<literal>WHERE</literal> clause, data synchronization does not use it
if the subscriber is a <productname>PostgreSQL</productname> version
before 15."

Could be expressed more simply like: "If the subscriber is a
<productname>PostgreSQL</productname> version before 15 then any row
filtering is ignored."
Agreed.
6. src/backend/commands/publicationcmds.c - wrong function comment
8<
 /*
  * Close all relations in the list.
+ *
+ * Publication node can have a different list element, hence, pub_drop_table
+ * indicates if it has a Relation (true) or PublicationTable (false).
  */
 static void
 CloseTableList(List *rels)

=>
The 2nd parameter does not exist in v18, so that comment about
pub_drop_table seems to be a cut/paste error from the OpenTableList.
Oops. Removed.
src/backend/replication/logical/tablesync.c - bug ?
@@ -829,16 +883,23 @@ copy_table(Relation rel)
relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
Assert(rel == relmapentry->localrel);

+ /* List of columns for COPY */
+ attnamelist = make_copy_attnamelist(relmapentry);
+
  /* Start copy on the publisher. */

=>

I did not understand the above call to make_copy_attnamelist. The
result seems unused before it is overwritten later in this same
function (??)
Good catch. This seems to be a leftover from an ancient version.
7. src/backend/replication/logical/tablesync.c -
fetch_remote_table_info enhancement

+ /* Get relation qual */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+     resetStringInfo(&cmd);
+     appendStringInfo(&cmd,
+                      "SELECT pg_get_expr(prqual, prrelid) "
+                      "  FROM pg_publication p "
+                      "  INNER JOIN pg_publication_rel pr "
+                      "       ON (p.oid = pr.prpubid) "
+                      " WHERE pr.prrelid = %u "
+                      "   AND p.pubname IN (", lrel->remoteid);

=>
I think a small improvement is possible in this SQL.
If we change that to "SELECT DISTINCT pg_get_expr(prqual, prrelid)"...
then it avoids the copy SQL from having multiple WHERE clauses which
are all identical. This could happen when subscribed to multiple
publications which had the same filter for the same table.
Good catch!
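To illustrate the duplicate-clause scenario with a hypothetical example
(invented table and publication names):

```sql
-- Two publications publish the same table with an identical filter.
CREATE PUBLICATION p1 FOR TABLE t WHERE (id > 100);
CREATE PUBLICATION p2 FOR TABLE t WHERE (id > 100);

-- Without DISTINCT, tablesync fetches the clause once per publication
-- and the generated COPY would repeat the identical WHERE clause; with
-- DISTINCT the clause is fetched only once.
SELECT DISTINCT pg_get_expr(prqual, prrelid)
  FROM pg_publication p
       INNER JOIN pg_publication_rel pr ON (p.oid = pr.prpubid)
 WHERE pr.prrelid = 't'::regclass
   AND p.pubname IN ('p1', 'p2');
```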
8. src/backend/replication/pgoutput/pgoutput.c - qual member is redundant
@@ -99,6 +108,9 @@ typedef struct RelationSyncEntry
  bool        replicate_valid;
  PublicationActions pubactions;
+ List       *qual;           /* row filter */
+ List       *exprstate;      /* ExprState for row filter */
+ TupleTableSlot *scantuple;  /* tuple table slot for row filter */

=>
Now that the exprstate is introduced I think that the other member
"qual" is redundant, so it can be removed.
I was thinking about it for the next patch. Removed.
9. src/backend/replication/pgoutput/pgoutput.c - comment typo?
8<
typo: it/that ?
I think it ought to say "This is the same code as ExecPrepareExpr()
but that is not used because"...
Fixed.
10. src/backend/replication/pgoutput/pgoutput.c - redundant debug logging?
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ elog(DEBUG3, "row filter %smatched", result ? "" : "not ");

The above debug logging is really only a repeat (with different
wording) of the same information already being logged inside the
pgoutput_row_filter_exec_expr function, isn't it? Consider removing the
redundant logging.
wording) of the same information already being logged inside the
pgoutput_row_filter_exec_expr function isn't it? Consider removing the
redundant logging.
Agreed. Removed.
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
v19-0001-Row-filter-for-logical-replication.patchtext/x-patch; name=v19-0001-Row-filter-for-logical-replication.patchDownload
From edc9ea832bacb663b47375b2132fe3f9d7482f78 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 12:07:51 -0300
Subject: [PATCH v19] Row filter for logical replication
This feature adds row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If subscriber is a pre-15 version, data
synchronization won't use row filters if they are defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 47 ++-
src/backend/commands/publicationcmds.c | 110 +++++---
src/backend/nodes/copyfuncs.c | 14 +
src/backend/nodes/equalfuncs.c | 12 +
src/backend/parser/gram.y | 24 +-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 13 +
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++-
src/backend/replication/pgoutput/pgoutput.c | 255 ++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 71 +++++
src/test/regress/sql/publication.sql | 32 +++
src/test/subscription/t/021_row_filter.pl | 298 ++++++++++++++++++++
26 files changed, 1047 insertions(+), 74 deletions(-)
create mode 100644 src/test/subscription/t/021_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index f517a7d4af..dbf2f46c00 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6233,6 +6233,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b2c6..4bb4314458 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbca55..35006d9ffc 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -182,6 +186,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
disallowed on those tables.
</para>
+ <para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or be covered by <literal>REPLICA
+ IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
+ be replicated. That's because old row is used and it only contains primary
+ key or columns that are part of the <literal>REPLICA IDENTITY</literal>; the
+ remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ and <command>UPDATE</command> operations, any column might be used in the
+ <literal>WHERE</literal> clause. New row is used and it contains all
+ columns. A <literal>NULL</literal> value causes the expression to evaluate
+ to false; avoid using columns without not-null constraints in the
+ <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
+ not allow functions and user-defined operators.
+ </para>
+
<para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
@@ -197,6 +216,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -209,6 +233,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</programlisting>
</para>
+ <para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
<para>
Create a publication that publishes all changes in all tables:
<programlisting>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index e812beee37..cf9424845d 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 86e415af89..b084cea2aa 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,21 +144,27 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -177,6 +186,26 @@ publication_add_relation(Oid pubid, Relation targetrel,
check_publication_add_relation(targetrel);
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
+
/* Form a tuple. */
memset(values, 0, sizeof(values));
memset(nulls, false, sizeof(nulls));
@@ -189,6 +218,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -205,6 +240,14 @@ publication_add_relation(Oid pubid, Relation targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 95c253c8e0..659f448c97 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -385,31 +385,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
+ PublicationRelationInfo *oldrel;
- foreach(newlc, rels)
- {
- Relation newrel = (Relation) lfirst(newlc);
-
- if (RelationGetRelid(newrel) == oldrelid)
- {
- found = true;
- break;
- }
- }
-
- if (!found)
- {
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
-
- delrels = lappend(delrels, oldrel);
- }
+ oldrel = palloc(sizeof(PublicationRelationInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -499,7 +492,8 @@ RemovePublicationRelById(Oid proid)
}
/*
- * Open relations specified by a RangeVar list.
+ * Open relations specified by a list of PublicationTable or RangeVar nodes.
+ *
* The returned tables are locked in ShareUpdateExclusiveLock mode in order to
* add them to a publication.
*/
@@ -509,16 +503,41 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationInfo *pri;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = castNode(RangeVar, lfirst(lc));
- bool recurse = rv->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a list of PublicationTable
+ * nodes (relation plus optional WHERE clause), while ALTER PUBLICATION
+ * ... DROP TABLE provides a plain RangeVar list. Check which we have.
+ */
+ if (IsA(lfirst(lc), PublicationTable))
+ whereclause = true;
+ else
+ whereclause = false;
+
+ if (whereclause)
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+ else
+ {
+ rv = castNode(RangeVar, lfirst(lc));
+ }
+
+ recurse = rv->inh;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
@@ -538,8 +557,14 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = myrelid;
+ pri->relation = rel;
+ if (whereclause)
+ pri->whereClause = t->whereClause;
+ else
+ pri->whereClause = NULL;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, myrelid);
/*
@@ -572,7 +597,15 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = childrelid;
+ pri->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (whereclause)
+ pri->whereClause = t->whereClause;
+ else
+ pri->whereClause = NULL;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, childrelid);
}
}
@@ -593,10 +626,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(pri->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -612,15 +647,15 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(pri->relid, GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(pri->relation->rd_rel->relkind),
+ RelationGetRelationName(pri->relation));
- obj = publication_add_relation(pubid, rel, if_not_exists);
+ obj = publication_add_relation(pubid, pri, if_not_exists);
if (stmt)
{
EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
@@ -644,11 +679,10 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pri->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -658,7 +692,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pri->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 6fef067957..e04797ba58 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4840,6 +4840,17 @@ _copyAlterPublicationStmt(const AlterPublicationStmt *from)
return newnode;
}
+static PublicationTable *
+_copyPublicationTable(const PublicationTable *from)
+{
+ PublicationTable *newnode = makeNode(PublicationTable);
+
+ COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
+
+ return newnode;
+}
+
static CreateSubscriptionStmt *
_copyCreateSubscriptionStmt(const CreateSubscriptionStmt *from)
{
@@ -5704,6 +5715,9 @@ copyObjectImpl(const void *from)
case T_AlterPublicationStmt:
retval = _copyAlterPublicationStmt(from);
break;
+ case T_PublicationTable:
+ retval = _copyPublicationTable(from);
+ break;
case T_CreateSubscriptionStmt:
retval = _copyCreateSubscriptionStmt(from);
break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index b9cc7b199c..06b0adc2b0 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2314,6 +2314,15 @@ _equalAlterPublicationStmt(const AlterPublicationStmt *a,
return true;
}
+static bool
+_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
+{
+ COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
+
+ return true;
+}
+
static bool
_equalCreateSubscriptionStmt(const CreateSubscriptionStmt *a,
const CreateSubscriptionStmt *b)
@@ -3700,6 +3709,9 @@ equal(const void *a, const void *b)
case T_AlterPublicationStmt:
retval = _equalAlterPublicationStmt(a, b);
break;
+ case T_PublicationTable:
+ retval = _equalPublicationTable(a, b);
+ break;
case T_CreateSubscriptionStmt:
retval = _equalCreateSubscriptionStmt(a, b);
break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index eb24195438..d82ea003db 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -426,14 +426,14 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
vacuum_relation_list opt_vacuum_relation_list
- drop_option_list
+ drop_option_list publication_table_list
%type <node> opt_routine_body
%type <groupclause> group_clause
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -9612,7 +9612,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9643,7 +9643,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9651,7 +9651,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9669,6 +9669,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 24268eb502..8fb953b54f 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -950,6 +957,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f928c32311..fc4170e723 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -509,6 +516,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1769,6 +1777,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3089,6 +3100,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de7da..e946f17c64 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23afc..29f8835ce1 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 682c107e74..a37a6feff1 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -691,19 +691,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -799,6 +803,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get the relation qual. DISTINCT prevents an expression published for
+ * this table by multiple publications from being included more than once
+ * in the final expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -812,6 +869,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -820,7 +878,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -831,14 +889,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -847,8 +909,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index abd5217ab1..7fbf9f4e31 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -99,6 +108,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -122,7 +133,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -131,6 +142,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -520,6 +538,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
OutputPluginWrite(ctx, false);
}
+/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache the ExprState in CacheMemoryContext. This mirrors
+ * ExecPrepareExpr(), which cannot be used here because it requires an
+ * EState. The executor should probably provide a helper for preparing
+ * expressions for execution outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false, i.e. the change
+ * isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change is to be replicated, false otherwise.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
/*
* Sends the decoded DML over wire.
*
@@ -547,7 +708,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -571,8 +732,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, txn, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -580,6 +739,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * Send the schema before possibly switching the relation below,
+ * since maybe_send_schema() also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -603,6 +772,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -631,6 +806,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -694,7 +875,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1005,9 +1186,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1030,6 +1212,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1041,6 +1225,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1054,6 +1239,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for the row filter. The TupleDesc must
+ * live as long as the cache entry remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1063,6 +1264,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1122,9 +1326,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1242,6 +1470,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1251,6 +1480,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1268,5 +1499,11 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
}
+
+ MemoryContextSwitchTo(oldctx);
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 321152151d..6f944ec60d 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4172,6 +4172,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4182,9 +4183,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4193,6 +4201,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4233,6 +4242,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4265,8 +4278,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index ba9bc6ddd2..7d72d498c1 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -626,6 +626,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 2abf255798..e2e64cb3bf 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBuffer(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index f332bad4d4..2703b9c3fe 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -83,6 +83,13 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationInfo
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationInfo;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -108,7 +115,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504cbb..154bb61777 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index d9e417bcd7..2037705f45 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -491,6 +491,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index def9651b34..cf815cc0f2 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3624,12 +3624,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3642,7 +3649,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 1500de2dd0..4537543a7b 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 63d6ab7a4e..444f8344bc 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -156,6 +156,77 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: "testpub_view" is not a table
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075368..b1606cce7e 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,38 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/021_row_filter.pl b/src/test/subscription/t/021_row_filter.pl
new file mode 100644
index 0000000000..0f6d2f0128
--- /dev/null
+++ b/src/test/subscription/t/021_row_filter.pl
@@ -0,0 +1,298 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions should succeed. In this case, rows are replicated if the c value
+# is divisible by both 2 AND 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
2.20.1
On Sun, Jul 11, 2021, at 8:09 PM, Tomas Vondra wrote:
I took a look at this patch, which seems to be in CF since 2018. I have
only some basic comments and observations at this point:
Tomas, thanks for reviewing this patch again.
1) alter_publication.sgml
I think "expression is executed" sounds a bit strange, perhaps
"evaluated" would be better?
Fixed.
2) create_publication.sgml
Why is the patch changing publish_via_partition_root docs? That seems
like a rather unrelated bit.
Removed. I will submit a separate patch for this.
The <literal>WHERE</literal> clause should probably contain only
columns that are part of the primary key or be covered by
<literal>REPLICA IDENTITY</literal> ...
I'm not sure what exactly this is trying to say. What does "should
probably ..." mean in practice for the users? Does that mean something
bad will happen for other columns, or what? I'm sure this wording will
be quite confusing for users.
Reading it again, it seems "probably" is confusing. Let's remove it.
It may also be unclear whether the condition is evaluated on the old or
new row, so perhaps add an example illustrating that & more detailed
comment, or something. E.g. what will happen with:
UPDATE departments SET active = false WHERE active;
Yeah. I avoided mentioning this internal detail about old/new rows, but it
seems better to be clear. How about the following paragraph?
<para>
The <literal>WHERE</literal> clause should contain only columns that are
part of the primary key or are covered by <literal>REPLICA
IDENTITY</literal>; otherwise, <command>DELETE</command> operations will not
be replicated. That's because the old row is used and it only contains the
primary key or the columns that are part of the <literal>REPLICA
IDENTITY</literal>; the remaining columns are <literal>NULL</literal>. For
<command>INSERT</command> and <command>UPDATE</command> operations, any
column might be used in the <literal>WHERE</literal> clause. The new row is
used and it contains all columns. A <literal>NULL</literal> value causes the
expression to evaluate to false; avoid using columns without not-null
constraints in the <literal>WHERE</literal> clause. The
<literal>WHERE</literal> clause does not allow functions or user-defined
operators.
</para>
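To make the NULL behavior concrete, here is a small sketch (plain Python simulating SQL three-valued logic, not PostgreSQL code; the filter expression is the one used in the TAP test) of why a DELETE whose old row lacks a filtered column is dropped. In the patch itself, the evaluation is done in the output plugin through an ExprState prepared by pgoutput_row_filter_init_expr.

```python
# Simulate SQL three-valued logic for the row filter
#   (a > 1000 AND b <> 'filtered')
# as applied by the patch: a NULL result is treated as false.
NULL = None  # stand-in for SQL NULL

def sql_gt(x, y):
    return NULL if x is NULL or y is NULL else x > y

def sql_ne(x, y):
    return NULL if x is NULL or y is NULL else x != y

def sql_and(a, b):
    # SQL AND: false dominates, then NULL (unknown), then true
    if a is False or b is False:
        return False
    if a is NULL or b is NULL:
        return NULL
    return True

def row_matches(a, b):
    # NULL (unknown) is treated as false, so the row is filtered out
    return sql_and(sql_gt(a, 1000), sql_ne(b, "filtered")) is True

# INSERT/UPDATE evaluate the new row, which carries all columns:
assert row_matches(1980, "not filtered")   # replicated
# DELETE evaluates the old row; b is not part of the PK/replica
# identity, so it arrives as NULL and the filter rejects the row:
assert not row_matches(1700, NULL)         # not replicated
```

This is only an illustration of the evaluation rule described in the paragraph above, not the patch's implementation.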
3) publication_add_relation
Does this need to build the parse state even for whereClause == NULL?
No. Fixed.
4) AlterPublicationTables
I wonder if this new reworked code might have issues with subscriptions
containing many tables, but I haven't tried.
This piece of code is already complicated. Amit complained about it too [1].
Are you envisioning any specific issue (other than opening thousands of
relations, doing some stuff, and closing them all)? IMO the relation
open/close should be postponed for as long as possible.
5) OpenTableList
I really dislike that the list can have two different node types
(Relation and PublicationTable). In principle we don't actually need the
extra flag, we can simply check the node type directly by IsA() and act
based on that. However, I think it'd be better to just use a single node
type from all places.
Amit complained about having a runtime test for ALTER PUBLICATION ... DROP
TABLE in case the user provides a WHERE clause [2]. I did it that way (a
runtime test) because it simplified the code. I would tend to avoid moving a
grammar-time task to runtime; that's why I agreed to change it. I didn't like
the multi-node
argument handling for OpenTableList() (mainly because of the extra argument in
the function signature) but with your suggestion (IsA()) maybe it is
acceptable. What do you think? I included IsA() in v19.
I don't see why not to set whereClause every time; I don't think the
extra if saves anything, it's just a bit more complex.
See runtime test in [2].
5) CloseTableList
The comment about node types seems pointless, this function has no flag
and the element type does not matter.
Fixed.
6) parse_agg.c
... are not allowed in publication WHERE expressions
I think all similar cases use "WHERE conditions" instead.
No. Policy, index, statistics, partition, and column generation use
"expressions"; COPY and trigger use "conditions". It is also referred to as an
expression in the synopsis.
7) transformExprRecurse
The check at the beginning seems rather awkward / misplaced - it's way
too specific for this location (there are no other p_expr_kind
references in this function). Wouldn't transformFuncCall (or maybe
ParseFuncOrColumn) be a more appropriate place?
Probably. I have to try multiple possibilities to make sure it forbids all
cases.
Initially I was wondering why not to allow function calls in WHERE
conditions, but I see that was discussed in the past as problematic. But
that reminds me that I don't see any docs describing what expressions
are allowed in WHERE conditions - maybe we should explicitly list what
expressions are allowed?
I started to investigate how to safely allow built-in functions. There is a
long discussion about using functions in a logical decoding context. As I said
during the last CF for v14, I prefer this to be a separate feature. I realized
that I mentioned in the commit message that functions and user-defined
operators are not allowed, but forgot to mention it in the documentation.
8) pgoutput.c
I have not reviewed this in detail yet, but there seems to be something
wrong because `make check-world` fails in subscription/010_truncate.pl
after hitting an assert (backtrace attached) during "START_REPLICATION
SLOT" in get_rel_sync_entry in this code:
That's because I didn't copy the TupleDesc into CacheMemoryContext. Greg
pointed it out too in a previous email [3]. The new patch (v19) includes a fix
for it.
[1]: /messages/by-id/CA+HiwqG3Jz-cRS=4gqXmZDjDAi==19GvrFCCqAawwHcOCEn4fQ@mail.gmail.com
[2]: /messages/by-id/CAA4eK1Lu7oPHm2j=nLeqZLVoro76E0EWvH+5wmGG39iJNBzUog@mail.gmail.com
[3]: /messages/by-id/CAJcOf-d70xg1O2jX1CrUeXaj+nMas3+NyJwYjbRsK6ZctH+x5Q@mail.gmail.com
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Tue, Jul 13, 2021, at 4:07 PM, Tomas Vondra wrote:
On 7/13/21 5:44 PM, Jeff Davis wrote:
On Tue, 2021-07-13 at 10:24 +0530, Amit Kapila wrote:
8<
(c) the columns
referred to in the filter should be part of PK or Replica Identity.
Why?
I'm not sure either.
This patch uses the old row for DELETE operations and new row for INSERT and
UPDATE operations. Since we usually don't use REPLICA IDENTITY FULL, all
columns in an old row that are not part of the PK or REPLICA IDENTITY are NULL.
The row filter evaluates NULL to false. Documentation says
<para>
The <literal>WHERE</literal> clause should contain only columns that are
part of the primary key or be covered by <literal>REPLICA
IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
be replicated. That's because old row is used and it only contains primary
key or columns that are part of the <literal>REPLICA IDENTITY</literal>; the
remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
and <command>UPDATE</command> operations, any column might be used in the
<literal>WHERE</literal> clause. New row is used and it contains all
columns. A <literal>NULL</literal> value causes the expression to evaluate
to false; avoid using columns without not-null constraints in the
<literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
not allow functions and user-defined operators.
</para>
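The restriction described in this paragraph can be illustrated with a short
example using the syntax proposed by this patch; the table and filter below
are invented for illustration:

```sql
-- Hypothetical table: the replica identity defaults to the primary key (id).
CREATE TABLE departments (id int PRIMARY KEY, name text);

-- Row filter on "name", which is NOT part of the replica identity.
CREATE PUBLICATION pub_depts FOR TABLE departments WHERE (name = 'sales');

-- INSERT/UPDATE use the new row, which carries all columns, so the filter
-- works as expected. For DELETE only the old row (just "id") is available,
-- so "name" is NULL, the filter evaluates to NULL (treated as false), and
-- the DELETE is never replicated.
```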
--
Euler Taveira
EDB https://www.enterprisedb.com/
On 2021-Jul-13, Euler Taveira wrote:
+ <para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or be covered by <literal>REPLICA
+ IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
+ be replicated. That's because old row is used and it only contains primary
+ key or columns that are part of the <literal>REPLICA IDENTITY</literal>; the
+ remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ and <command>UPDATE</command> operations, any column might be used in the
+ <literal>WHERE</literal> clause. New row is used and it contains all
+ columns. A <literal>NULL</literal> value causes the expression to evaluate
+ to false; avoid using columns without not-null constraints in the
+ <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
+ not allow functions and user-defined operators.
+ </para>
There's a couple of points in this paragraph ..
1. if you use REPLICA IDENTITY FULL, then the expressions would work
even if they use any other column with DELETE. Maybe it would be
reasonable to test for this in the code and raise an error if the
expression requires a column that's not part of the replica identity.
(But that could be relaxed if the publication does not publish
updates/deletes.)
2. For UPDATE, does the expression apply to the old tuple or to the new
tuple? You say it's the new tuple, but from the user point of view I
think it would make more sense that it would apply to the old tuple.
(Of course, if you're thinking that the R.I. is the PK and the PK is
never changed, then you don't really care which one it is, but I bet
that some people would not like that assumption.)
I think it is sensible that it's the old tuple that is matched, not the
new; consider what happens if you change the PK in the update and the
replica already has that tuple. If you match on the new tuple and it
doesn't match the expression (so you filter out the update), but the old
tuple does match the expression, then the replica will retain the
mismatching tuple forever.
3. You say that a NULL value in any of those columns causes the
expression to become false and thus the tuple is not published. This
seems pretty unfriendly, but maybe it would be useful to have examples
of the behavior. Does ExecInitCheck() handle things in the other way,
and if so does using a similar trick give more useful behavior?
<para>
The WHERE clause may only contain references to columns that are part
of the table's replica identity.
If <>DELETE</> or <>UPDATE</> operations are published, this
restriction can be bypassed by making the replica identity be the whole
row with <command>ALTER TABLE .. SET REPLICA IDENTITY FULL</command>.
The <literal>WHERE</literal> clause does not allow functions or
user-defined operators.
</para>
--
Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/
"The Gord often wonders why people threaten never to come back after they've
been told never to return" (www.actsofgord.com)
On Tue, Jul 13, 2021, at 6:06 PM, Alvaro Herrera wrote:
1. if you use REPLICA IDENTITY FULL, then the expressions would work
even if they use any other column with DELETE. Maybe it would be
reasonable to test for this in the code and raise an error if the
expression requires a column that's not part of the replica identity.
(But that could be relaxed if the publication does not publish
updates/deletes.)
I thought about it but came to the conclusion that it isn't worth it. Even
with REPLICA IDENTITY FULL, the expression evaluates to false if the column
allows NULL values. Besides that, REPLICA IDENTITY is changed via another DDL
(ALTER TABLE), and you would have to forbid changing the REPLICA IDENTITY
whenever some row filter uses a column you want to remove from it.
2. For UPDATE, does the expression apply to the old tuple or to the new
tuple? You say it's the new tuple, but from the user point of view I
think it would make more sense that it would apply to the old tuple.
(Of course, if you're thinking that the R.I. is the PK and the PK is
never changed, then you don't really care which one it is, but I bet
that some people would not like that assumption.)
New tuple. The main reason is that the new tuple is always there for UPDATEs.
Hence, the row filter might succeed even if it contains a column that is not
part of the PK or REPLICA IDENTITY. pglogical also chooses to use the new
tuple when it is available (e.g. for INSERT and UPDATE). If you don't like
this approach we can (a) create a new publication option to choose between old
tuple and new tuple for UPDATEs, or (b) qualify columns using a special
reference (such as NEW.id or OLD.foo). Both options provide flexibility, but
(a) is simpler.
I think it is sensible that it's the old tuple that is matched, not the
new; consider what happens if you change the PK in the update and the
replica already has that tuple. If you match on the new tuple and it
doesn't match the expression (so you filter out the update), but the old
tuple does match the expression, then the replica will retain the
mismatching tuple forever.
3. You say that a NULL value in any of those columns causes the
expression to become false and thus the tuple is not published. This
seems pretty unfriendly, but maybe it would be useful to have examples
of the behavior. Does ExecInitCheck() handle things in the other way,
and if so does using a similar trick give more useful behavior?
ExecInitCheck() is designed for CHECK constraints, and the SQL standard
requires that NULL constraint conditions not be treated as violations. This
feature uses a WHERE clause and behaves like one; I mean, a NULL result does
not return the row. See ExecQual().
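The difference between the two behaviors is easy to check with stock SQL; the
snippet below is independent of the patch:

```sql
-- WHERE semantics (ExecQual): a NULL qual result simply drops the row.
SELECT 1 AS matched WHERE NULL::int > 10;   -- returns zero rows, no error

-- CHECK semantics (ExecInitCheck): per the SQL standard, a NULL check
-- condition is NOT a violation, so the insert succeeds.
CREATE TABLE c (x int CHECK (x > 10));
INSERT INTO c VALUES (NULL);                -- accepted
```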
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Wed, Jul 14, 2021 at 6:28 AM Euler Taveira <euler@eulerto.com> wrote:
On Tue, Jul 13, 2021, at 6:06 PM, Alvaro Herrera wrote:
1. if you use REPLICA IDENTITY FULL, then the expressions would work
even if they use any other column with DELETE. Maybe it would be
reasonable to test for this in the code and raise an error if the
expression requires a column that's not part of the replica identity.
(But that could be relaxed if the publication does not publish
updates/deletes.)
+1.
I thought about it but came to the conclusion that it doesn't worth it. Even
with REPLICA IDENTITY FULL expression evaluates to false if the column allows
NULL values. Besides that REPLICA IDENTITY is changed via another DDL (ALTER
TABLE) and you have to make sure you don't allow changing REPLICA IDENTITY
because some row filter uses the column you want to remove from it.
Yeah, that is required but is it not feasible to do so?
2. For UPDATE, does the expression apply to the old tuple or to the new
tuple? You say it's the new tuple, but from the user point of view I
think it would make more sense that it would apply to the old tuple.
(Of course, if you're thinking that the R.I. is the PK and the PK is
never changed, then you don't really care which one it is, but I bet
that some people would not like that assumption.)
New tuple. The main reason is that new tuple is always there for UPDATEs.
I am not sure if that is a very good reason to use a new tuple.
--
With Regards,
Amit Kapila.
On Wed, Jul 14, 2021 at 12:51 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2021-Jul-13, Tomas Vondra wrote:
On 7/13/21 5:44 PM, Jeff Davis wrote:
* Andres also mentioned that the function should not leak memory.
* One use case for this feature is when sharding a table, so the
expression should allow things like "hashint8(x) between ...". I'd
really like to see this problem solved, as well.
..
Not sure about the memory leaks - I suppose we'd free memory for each row,
so this shouldn't be an issue I guess ...
I'm not sure we need to be terribly strict about expression evaluation
not leaking any memory here. I'd rather have a memory context that can
be reset per row.
I also think that should be sufficient here and if I am reading
correctly patch already evaluates the expression in per-tuple context
and reset it for each tuple. Jeff, do you or Andres have something
else in mind?
--
With Regards,
Amit Kapila.
On Wed, Jul 14, 2021 at 12:37 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 7/13/21 5:44 PM, Jeff Davis wrote:
On Tue, 2021-07-13 at 10:24 +0530, Amit Kapila wrote:
Also:
* Andres also mentioned that the function should not leak memory.
* One use case for this feature is when sharding a table, so the
expression should allow things like "hashint8(x) between ...". I'd
really like to see this problem solved, as well.
I think built-in functions should be fine, because they generally don't get
dropped etc. (And if you drop a built-in function, well - sorry.)
I am not sure if all built-in functions are also safe. I think we
can't allow volatile functions (ex. setval) that can update the
database which doesn't seem to be allowed in the historic snapshot.
Similarly, it might not be okay to invoke stable functions that access
the database as those might expect current snapshot. I think immutable
functions should be okay but that brings us to Jeff's question of can
we tie the marking of functions that can be used here with
IMMUTABLE/STABLE/VOLATILE designation? The UDFs might have a higher
risk that something used in those functions can be dropped but I guess
we can address that by using the current snapshot to access the
publication catalog.
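The volatility marking referred to above is already recorded in pg_proc, so a
first approximation of "safe built-ins" could be queried directly; this only
illustrates the existing catalog, it is not part of the patch:

```sql
-- 'i' = immutable, 's' = stable, 'v' = volatile
SELECT proname, provolatile
FROM pg_proc
WHERE proname IN ('setval', 'hashint8', 'now');
-- setval is volatile, now() is stable, hashint8 is immutable.
```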
Not sure about the memory leaks - I suppose we'd free memory for each
row, so this shouldn't be an issue I guess ...
I think in the long run one idea to allow UDFs is probably by
explicitly allowing users to specify whether the function is
publication predicate safe and if so, then we can allow such
functions
in the filter clause.
This sounds like a better direction. We probably need some kind of
catalog information here to say what functions/operators are "safe" for
this kind of purpose. There are a couple questions:
Not sure. It's true it's a bit like volatile/stable/immutable categories
where we can't guarantee those labels are correct, and it's up to the
user to keep the pieces if they pick the wrong category.
But we can achieve the same goal by introducing a simple GUC called
dangerous_allow_udf_in_decoding, I think.
One guc for all UDFs sounds dangerous.
--
With Regards,
Amit Kapila.
On Wed, Jul 14, 2021 at 12:45 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 7/13/21 12:57 PM, Amit Kapila wrote:
On Tue, Jul 13, 2021 at 10:24 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
I think the problem described by Petr[1] is also possible today if the
user drops the publication and there is a corresponding subscription,
basically, the system will stuck with error: "ERROR: publication
"mypub" does not exist. I think allowing to use non-historic snapshots
just for publications will resolve that problem as well.
[1] - /messages/by-id/92e5587d-28b8-5849-2374-5ca3863256f1@2ndquadrant.com
That seems like a completely different problem, TBH. For example the
slot is dropped too, which means the WAL is likely gone etc.
I think if we can use WAL archive (if available) and re-create the
slot, the system should move but recreating the publication won't
allow the system to move.
--
With Regards,
Amit Kapila.
On 7/14/21 7:39 AM, Amit Kapila wrote:
On Wed, Jul 14, 2021 at 6:28 AM Euler Taveira <euler@eulerto.com> wrote:
On Tue, Jul 13, 2021, at 6:06 PM, Alvaro Herrera wrote:
1. if you use REPLICA IDENTITY FULL, then the expressions would work
even if they use any other column with DELETE. Maybe it would be
reasonable to test for this in the code and raise an error if the
expression requires a column that's not part of the replica identity.
(But that could be relaxed if the publication does not publish
updates/deletes.)
+1.
I thought about it but came to the conclusion that it doesn't worth it. Even
with REPLICA IDENTITY FULL expression evaluates to false if the column allows
NULL values. Besides that REPLICA IDENTITY is changed via another DDL (ALTER
TABLE) and you have to make sure you don't allow changing REPLICA IDENTITY
because some row filter uses the column you want to remove from it.
Yeah, that is required but is it not feasible to do so?
2. For UPDATE, does the expression apply to the old tuple or to the new
tuple? You say it's the new tuple, but from the user point of view I
think it would make more sense that it would apply to the old tuple.
(Of course, if you're thinking that the R.I. is the PK and the PK is
never changed, then you don't really care which one it is, but I bet
that some people would not like that assumption.)
New tuple. The main reason is that new tuple is always there for UPDATEs.
I am not sure if that is a very good reason to use a new tuple.
True. Perhaps we should look at other places with similar concept of
WHERE conditions and old/new rows, and try to be consistent with those?
I can think of:
1) updatable views with CHECK option
2) row-level security
3) triggers
Is there some reasonable rule which of the old/new tuples (or both) to
use for the WHERE condition? Or maybe it'd be handy to allow referencing
OLD/NEW as in triggers?
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Jul 14, 2021 at 3:58 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 7/14/21 7:39 AM, Amit Kapila wrote:
On Wed, Jul 14, 2021 at 6:28 AM Euler Taveira <euler@eulerto.com> wrote:
On Tue, Jul 13, 2021, at 6:06 PM, Alvaro Herrera wrote:
1. if you use REPLICA IDENTITY FULL, then the expressions would work
even if they use any other column with DELETE. Maybe it would be
reasonable to test for this in the code and raise an error if the
expression requires a column that's not part of the replica identity.
(But that could be relaxed if the publication does not publish
updates/deletes.)
+1.
I thought about it but came to the conclusion that it doesn't worth it. Even
with REPLICA IDENTITY FULL expression evaluates to false if the column allows
NULL values. Besides that REPLICA IDENTITY is changed via another DDL (ALTER
TABLE) and you have to make sure you don't allow changing REPLICA IDENTITY
because some row filter uses the column you want to remove from it.
Yeah, that is required but is it not feasible to do so?
2. For UPDATE, does the expression apply to the old tuple or to the new
tuple? You say it's the new tuple, but from the user point of view I
think it would make more sense that it would apply to the old tuple.
(Of course, if you're thinking that the R.I. is the PK and the PK is
never changed, then you don't really care which one it is, but I bet
that some people would not like that assumption.)
New tuple. The main reason is that new tuple is always there for UPDATEs.
I am not sure if that is a very good reason to use a new tuple.
True. Perhaps we should look at other places with similar concept of
WHERE conditions and old/new rows, and try to be consistent with those?
I can think of:
1) updatable views with CHECK option
2) row-level security
3) triggers
Is there some reasonable rule which of the old/new tuples (or both) to
use for the WHERE condition? Or maybe it'd be handy to allow referencing
OLD/NEW as in triggers?
I think for insert we are only allowing those rows to replicate which
are matching the filter conditions, so if we are updating any row then we
should also maintain that sanity, right? That means at least on the NEW
rows we should apply the filter, IMHO. That said, if a row was inserted
that satisfied the filter and was replicated, and we then update it with
a new value that does not satisfy the filter, it will not be replicated.
I think that makes sense: if an insert does not send a row failing the
filter to the replica, why should an update have to do that, right?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, Jul 14, 2021 at 6:38 AM Euler Taveira <euler@eulerto.com> wrote:
Peter, thanks for quickly checking the new patch. I'm attaching a new patch (v19)
that addresses (a) this new review, (b) Tomas' review and (c) Greg's review. I
also included the copy/equal node support for the new node (PublicationTable)
mentioned by Tomas in another email.
Some minor v19 patch review points you might consider for your next
patch version:
(I'm still considering the other issues raised about WHERE clauses and
filtering)
(1) src/backend/commands/publicationcmds.c
OpenTableList
Some suggested abbreviations:
BEFORE:
if (IsA(lfirst(lc), PublicationTable))
whereclause = true;
else
whereclause = false;
AFTER:
whereclause = IsA(lfirst(lc), PublicationTable);
BEFORE:
if (whereclause)
pri->whereClause = t->whereClause;
else
pri->whereClause = NULL;
AFTER:
pri->whereClause = whereclause ? t->whereClause : NULL;
(2) src/backend/parser/parse_expr.c
I think that the check below:
/* Functions are not allowed in publication WHERE clauses */
if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE &&
nodeTag(expr) == T_FuncCall)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("functions are not allowed in publication WHERE expressions"),
parser_errposition(pstate, exprLocation(expr))));
should be moved down into the "T_FuncCall" case of the switch
statement below it, so that "if (pstate->p_expr_kind ==
EXPR_KIND_PUBLICATION_WHERE" doesn't get checked every call to
transformExprRecurse() regardless of the expression Node type.
(3) Save a nanosecond when entry->exprstate is already NIL:
BEFORE:
if (entry->exprstate != NIL)
list_free_deep(entry->exprstate);
entry->exprstate = NIL;
AFTER:
if (entry->exprstate != NIL)
{
list_free_deep(entry->exprstate);
entry->exprstate = NIL;
}
Regards,
Greg Nancarrow
Fujitsu Australia
On Wed, Jul 14, 2021 at 3:58 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 7/14/21 7:39 AM, Amit Kapila wrote:
On Wed, Jul 14, 2021 at 6:28 AM Euler Taveira <euler@eulerto.com> wrote:
On Tue, Jul 13, 2021, at 6:06 PM, Alvaro Herrera wrote:
1. if you use REPLICA IDENTITY FULL, then the expressions would work
even if they use any other column with DELETE. Maybe it would be
reasonable to test for this in the code and raise an error if the
expression requires a column that's not part of the replica identity.
(But that could be relaxed if the publication does not publish
updates/deletes.)
+1.
I thought about it but came to the conclusion that it doesn't worth it. Even
with REPLICA IDENTITY FULL expression evaluates to false if the column allows
NULL values. Besides that REPLICA IDENTITY is changed via another DDL (ALTER
TABLE) and you have to make sure you don't allow changing REPLICA IDENTITY
because some row filter uses the column you want to remove from it.
Yeah, that is required but is it not feasible to do so?
2. For UPDATE, does the expression apply to the old tuple or to the new
tuple? You say it's the new tuple, but from the user point of view I
think it would make more sense that it would apply to the old tuple.
(Of course, if you're thinking that the R.I. is the PK and the PK is
never changed, then you don't really care which one it is, but I bet
that some people would not like that assumption.)
New tuple. The main reason is that new tuple is always there for UPDATEs.
I am not sure if that is a very good reason to use a new tuple.
True. Perhaps we should look at other places with similar concept of
WHERE conditions and old/new rows, and try to be consistent with those?
I can think of:
1) updatable views with CHECK option
2) row-level security
3) triggers
Is there some reasonable rule which of the old/new tuples (or both) to
use for the WHERE condition? Or maybe it'd be handy to allow referencing
OLD/NEW as in triggers?
I think apart from the above, it might be good if we can find out what
some other databases do in this regard.
--
With Regards,
Amit Kapila.
On 2021-Jul-14, Dilip Kumar wrote:
I think for insert we are only allowing those rows to replicate which
are matching filter conditions, so if we updating any row then also we
should maintain that sanity right? That means at least on the NEW rows
we should apply the filter, IMHO. Said that, now if there is any row
inserted which were satisfying the filter and replicated, if we update
it with the new value which is not satisfying the filter then it will
not be replicated, I think that makes sense because if an insert is
not sending any row to a replica which is not satisfying the filter
then why update has to do that, right?
Right, that's a good aspect to think about.
I think the guiding principle for which tuple to use for the filter is
what is most useful to the potential user of the feature, rather than
what is the easiest to implement.
--
Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/
"La libertad es como el dinero; el que no la sabe emplear la pierde" (Álvarez)
On 7/14/21 4:01 PM, Alvaro Herrera wrote:
On 2021-Jul-14, Dilip Kumar wrote:
I think for insert we are only allowing those rows to replicate which
are matching filter conditions, so if we updating any row then also we
should maintain that sanity right? That means at least on the NEW rows
we should apply the filter, IMHO. Said that, now if there is any row
inserted which were satisfying the filter and replicated, if we update
it with the new value which is not satisfying the filter then it will
not be replicated, I think that makes sense because if an insert is
not sending any row to a replica which is not satisfying the filter
then why update has to do that, right?Right, that's a good aspect to think about.
I agree, that seems like a reasonable approach.
The way I'm thinking about this is that for INSERT and DELETE it's clear
which row version should be used (because there's just one). And for
UPDATE we could see that as DELETE + INSERT, and apply the same rule to
each action.
On the other hand, I can imagine cases where it'd be useful to send the
UPDATE when the old row matches the condition and new row does not.
I think the guiding principle for which tuple to use for the filter is
what is most useful to the potential user of the feature, rather than
what is the easiest to implement.
+1
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 7/14/21 2:50 PM, Amit Kapila wrote:
On Wed, Jul 14, 2021 at 3:58 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 7/14/21 7:39 AM, Amit Kapila wrote:
On Wed, Jul 14, 2021 at 6:28 AM Euler Taveira <euler@eulerto.com> wrote:
On Tue, Jul 13, 2021, at 6:06 PM, Alvaro Herrera wrote:
1. if you use REPLICA IDENTITY FULL, then the expressions would work
even if they use any other column with DELETE. Maybe it would be
reasonable to test for this in the code and raise an error if the
expression requires a column that's not part of the replica identity.
(But that could be relaxed if the publication does not publish
updates/deletes.)
+1.
I thought about it but came to the conclusion that it doesn't worth it. Even
with REPLICA IDENTITY FULL expression evaluates to false if the column allows
NULL values. Besides that REPLICA IDENTITY is changed via another DDL (ALTER
TABLE) and you have to make sure you don't allow changing REPLICA IDENTITY
because some row filter uses the column you want to remove from it.
Yeah, that is required but is it not feasible to do so?
2. For UPDATE, does the expression apply to the old tuple or to the new
tuple? You say it's the new tuple, but from the user point of view I
think it would make more sense that it would apply to the old tuple.
(Of course, if you're thinking that the R.I. is the PK and the PK is
never changed, then you don't really care which one it is, but I bet
that some people would not like that assumption.)
New tuple. The main reason is that new tuple is always there for UPDATEs.
I am not sure if that is a very good reason to use a new tuple.
True. Perhaps we should look at other places with similar concept of
WHERE conditions and old/new rows, and try to be consistent with those?
I can think of:
1) updatable views with CHECK option
2) row-level security
3) triggers
Is there some reasonable rule which of the old/new tuples (or both) to
use for the WHERE condition? Or maybe it'd be handy to allow referencing
OLD/NEW as in triggers?
I think apart from the above, it might be good if we can find what
some other databases does in this regard?
Yeah, that might tell us what the users would like to do with it. I did
some quick search, but haven't found much :-( The one thing I found is
that Debezium [1] allows accessing both the "old" and "new" rows through
value.before and value.after, and use both for filtering.
I haven't found much about how this works in other databases, sadly.
Perhaps the best way forward is to stick to the approach that INSERT
uses new, DELETE uses old and UPDATE works as DELETE+INSERT (probably),
and leave anything fancier (like being able to reference both versions
of the row) for a future patch.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Jul 14, 2021 at 8:04 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
Perhaps the best way forward is to stick to the approach that INSERT
uses new, DELETE uses old and UPDATE works as DELETE+INSERT (probably),
and leave anything fancier (like being able to reference both versions
of the row) for a future patch.
If UPDATE works as DELETE + INSERT, does that mean both the OLD row and
the NEW row should satisfy the filter, then only it will be sent?
That means if we insert a row that is not satisfying the condition
(which is not sent to the subscriber) and later if we update that row
and change the values such that the modified value matches the filter
then we will not send it because only the NEW row is satisfying the
condition but OLD row doesn't. I am just trying to understand your
idea. Or you are saying that in this case, we will not send anything
for the OLD row as it was not satisfying the condition but the
modified row will be sent as an INSERT operation because this is
satisfying the condition?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On 2021-Jul-14, Tomas Vondra wrote:
The way I'm thinking about this is that for INSERT and DELETE it's clear
which row version should be used (because there's just one). And for UPDATE
we could see that as DELETE + INSERT, and apply the same rule to each
action.
On the other hand, I can imagine cases where it'd be useful to send the
UPDATE when the old row matches the condition and new row does not.
In any case, it seems to me that the condition expression should be
scanned to see which columns are used in Vars (pull_varattnos?), and
verify if those columns are in the REPLICA IDENTITY; and if they are
not, raise an error. Most of the time the REPLICA IDENTITY is going to
be the primary key; but if the user wants to use other columns in the
expression, we can HINT that they can set REPLICA IDENTITY FULL.
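From the user's side, the workaround suggested by that HINT would look like
this (the table name is hypothetical):

```sql
-- Make every column part of the replica identity, so the old row carries
-- all column values and any column may safely appear in the row filter:
ALTER TABLE departments REPLICA IDENTITY FULL;
```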
--
Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/
<Schwern> It does it in a really, really complicated way
<crab> why does it need to be complicated?
<Schwern> Because it's MakeMaker.
On 7/14/21 4:48 PM, Dilip Kumar wrote:
On Wed, Jul 14, 2021 at 8:04 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
Perhaps the best way forward is to stick to the approach that INSERT
uses new, DELETE uses old and UPDATE works as DELETE+INSERT (probably),
and leave anything fancier (like being able to reference both versions
of the row) for a future patch.
If UPDATE works as DELETE + INSERT, does that mean both the OLD row and
the NEW row should satisfy the filter, then only it will be sent?
That means if we insert a row that is not satisfying the condition
(which is not sent to the subscriber) and later if we update that row
and change the values such that the modified value matches the filter
then we will not send it because only the NEW row is satisfying the
condition but OLD row doesn't. I am just trying to understand your
idea. Or you are saying that in this case, we will not send anything
for the OLD row as it was not satisfying the condition but the
modified row will be sent as an INSERT operation because this is
satisfying the condition?
Good questions. I'm not sure, I probably have not thought it through.
So yeah, I think we should probably stick to the principle that what we
send needs to match the filter condition, which applied to this case
would mean we should be looking at the new row version.
The more elaborate scenarios can be added later by a patch allowing to
explicitly reference the old/new row versions.
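Under the "new row decides" rule, the scenario discussed above would play out
as follows (hypothetical table and filter; the exact behavior is still under
discussion):

```sql
CREATE PUBLICATION pub FOR TABLE t WHERE (id > 100);

INSERT INTO t VALUES (50);            -- new row fails the filter: not sent
UPDATE t SET id = 200 WHERE id = 50;  -- new row passes the filter: sent
-- But the subscriber never received the row with id = 50, so the replicated
-- UPDATE finds no matching row to modify; the tables diverge unless the
-- UPDATE is transformed into an INSERT on the way out.
```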
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 7/14/21 4:52 PM, Alvaro Herrera wrote:
On 2021-Jul-14, Tomas Vondra wrote:
The way I'm thinking about this is that for INSERT and DELETE it's clear
which row version should be used (because there's just one). And for UPDATE
we could see that as DELETE + INSERT, and apply the same rule to each
action.On the other hand, I can imagine cases where it'd be useful to send the
UPDATE when the old row matches the condition and new row does not.In any case, it seems to me that the condition expression should be
scanned to see which columns are used in Vars (pull_varattnos?), and
verify if those columns are in the REPLICA IDENTITY; and if they are
not, raise an error. Most of the time the REPLICA IDENTITY is going to
be the primary key; but if the user wants to use other columns in the
expression, we can HINT that they can set REPLICA IDENTITY FULL.
Yeah, but AFAIK that's needed only when replicating DELETEs, so perhaps
we could ignore this for subscriptions without DELETE.
The other question is when to check/enforce this. I guess we'll have to
do that during decoding, not just when the publication is being created,
because the user can do ALTER TABLE later.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2021-Jul-14, Tomas Vondra wrote:
On 7/14/21 4:52 PM, Alvaro Herrera wrote:
In any case, it seems to me that the condition expression should be
scanned to see which columns are used in Vars (pull_varattnos?), and
verify if those columns are in the REPLICA IDENTITY; and if they are
not, raise an error. Most of the time the REPLICA IDENTITY is going to
be the primary key; but if the user wants to use other columns in the
expression, we can HINT that they can set REPLICA IDENTITY FULL.
Yeah, but AFAIK that's needed only when replicating DELETEs, so perhaps we
could ignore this for subscriptions without DELETE.
Yeah, I said that too in my older reply :-)
The other question is when to check/enforce this. I guess we'll have to do
that during decoding, not just when the publication is being created,
because the user can do ALTER TABLE later.
... if you're saying the user can change the replica identity after we
have some publications with filters defined, then I think we should
verify during ALTER TABLE and not allow the change if there's a
publication that requires it. I mean, during decoding we should be able
to simply assume that the tuple is correct for what we need at that
point.
--
Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/
On Wed, Jul 14, 2021, at 11:48 AM, Dilip Kumar wrote:
On Wed, Jul 14, 2021 at 8:04 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
Perhaps the best way forward is to stick to the approach that INSERT
uses new, DELETE uses old and UPDATE works as DELETE+INSERT (probably),
and leave anything fancier (like being able to reference both versions
of the row) for a future patch.
If UPDATE works as DELETE + INSERT, does that mean both the OLD row and
the NEW row should satisfy the filter, then only it will be sent?
That means if we insert a row that is not satisfying the condition
(which is not sent to the subscriber) and later if we update that row
and change the values such that the modified value matches the filter
then we will not send it because only the NEW row is satisfying the
condition but OLD row doesn't. I am just trying to understand your
idea. Or you are saying that in this case, we will not send anything
for the OLD row as it was not satisfying the condition but the
modified row will be sent as an INSERT operation because this is
satisfying the condition?
That's a fair argument for the default UPDATE behavior. It seems we have a
consensus that the UPDATE operation will use the old row. If there are no
objections, I will change it in the next version.
We can certainly discuss the possibilities for UPDATE operations. It can choose
which row to use: old, new or both (using an additional publication argument or
OLD and NEW placeholders to reference old and new rows are feasible ideas).
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Wed, Jul 14, 2021, at 12:08 PM, Tomas Vondra wrote:
Yeah, but AFAIK that's needed only when replicating DELETEs, so perhaps
we could ignore this for subscriptions without DELETE.
... and UPDATE. It seems we have a consensus to use old row in the row filter
for UPDATEs. I think you meant publication.
The other question is when to check/enforce this. I guess we'll have to
do that during decoding, not just when the publication is being created,
because the user can do ALTER TABLE later.
I'm afraid this check during decoding has a considerable cost. If we want to
enforce this condition, I suggest that we add it to CREATE PUBLICATION, ALTER
PUBLICATION ... ADD|SET TABLE and ALTER TABLE ... REPLICA IDENTITY. Data are
being constantly modified; schema is not.
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Wed, Jul 14, 2021, at 8:21 AM, Greg Nancarrow wrote:
Some minor v19 patch review points you might consider for your next
patch version:
Greg, thanks for another review. I agree with all of these changes. It will be
in the next patch.
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Wed, Jul 14, 2021 at 8:43 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2021-Jul-14, Tomas Vondra wrote:
The other question is when to check/enforce this. I guess we'll have to do
that during decoding, not just when the publication is being created,
because the user can do ALTER TABLE later.
... if you're saying the user can change the replica identity after we
have some publications with filters defined, then I think we should
verify during ALTER TABLE and not allow the change if there's a
publication that requires it. I mean, during decoding we should be able
to simply assume that the tuple is correct for what we need at that
point.
+1.
--
With Regards,
Amit Kapila.
On Wed, Jul 14, 2021 at 10:55 PM Euler Taveira <euler@eulerto.com> wrote:
On Wed, Jul 14, 2021, at 12:08 PM, Tomas Vondra wrote:
Yeah, but AFAIK that's needed only when replicating DELETEs, so perhaps
we could ignore this for subscriptions without DELETE.
... and UPDATE. It seems we have a consensus to use old row in the row filter
for UPDATEs. I think you meant publication.
If I read correctly people are suggesting to use a new row for updates
but I still suggest completing the analysis (or at least spend some
more time) Tomas and I requested in the few emails above and then
conclude on this point.
--
With Regards,
Amit Kapila.
On Thu, Jul 15, 2021 at 7:37 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jul 14, 2021 at 10:55 PM Euler Taveira <euler@eulerto.com> wrote:
On Wed, Jul 14, 2021, at 12:08 PM, Tomas Vondra wrote:
Yeah, but AFAIK that's needed only when replicating DELETEs, so perhaps
we could ignore this for subscriptions without DELETE.
... and UPDATE. It seems we have a consensus to use old row in the row filter
for UPDATEs. I think you meant publication.
If I read correctly people are suggesting to use a new row for updates
Right
but I still suggest completing the analysis (or at least spend some
more time) Tomas and I requested in the few emails above and then
conclude on this point.
+1
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, Jul 14, 2021 at 10:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I think apart from the above, it might be good if we can find what
some other databases do in this regard?
I did a bit of investigation in the case of Oracle Database and SQL Server.
(purely from my interpretation of available documentation; I did not
actually use the replication software)
For Oracle (GoldenGate), it appears that it provides the ability for
filters to reference both OLD and NEW rows in replication of UPDATEs:
"For update operations, it can be advantageous to retrieve the before
values of source columns: the values before the update occurred. These
values are stored in the trail and can be used in filters and column
mappings"
It provides @BEFORE and @AFTER functions for this.
For SQL Server, the available replication models seem quite different
to that in PostgreSQL, and not all seem to support row filtering.
For "snapshot replication", it seems that it effectively supports
filtering rows on the NEW values.
It seems that the snapshot is taken at a transactional boundary with rows
included according to any filtering, and is then replicated.
So to include the result of a particular UPDATE in the replication,
the replication row filtering would effectively be done on the result
(NEW) rows.
Another type of replication that supports row filtering is "merge
replication", which again seems to be effectively based on NEW rows:
"For merge replication to process a row, the data in the row must
satisfy the row filter, and it must have changed since the last
synchronization"
It's not clear to me if there is ANY way to filter on the OLD row
values by using some option.
If anybody has experience with the replication software for these
other databases and I've interpreted the documentation for these
incorrectly, please let me know.
Regards,
Greg Nancarrow
Fujitsu Australia
On Thu, Jul 15, 2021 at 4:30 AM Euler Taveira <euler@eulerto.com> wrote:
On Wed, Jul 14, 2021, at 8:21 AM, Greg Nancarrow wrote:
Some minor v19 patch review points you might consider for your next
patch version:
Greg, thanks for another review. I agree with all of these changes. It will be
in the next patch.
Hi, here are a couple more minor review comments for the V19 patch.
(The 2nd one overlaps a bit with one that Greg previously gave).
//////
1. doc/src/sgml/ref/create_publication.sgml
+ columns. A <literal>NULL</literal> value causes the expression to evaluate
+ to false; avoid using columns without not-null constraints in the
+ <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
+ not allow functions and user-defined operators.
+ </para>
=>
typo: "and user-defined operators." --> "or user-defined operators."
------
2. src/backend/commands/publicationcmds.c - OpenTableList IsA logic
IIUC the tables list can only consist of one kind of list element.
Since there is no expected/permitted "mixture" of kinds then there is
no need to check the IsA within the loop like v19 is doing; instead
you can check only the list head element. If you want to, then you
could Assert that every list element has a consistent kind as the
initial kind, but maybe that is overkill too?
PSA a small tmp patch to demonstrate what this comment is about.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v19-0001-PS-tmp-OpenTableList-IsA-logic.patchapplication/octet-stream; name=v19-0001-PS-tmp-OpenTableList-IsA-logic.patchDownload
From 5b3c45f8e7c32d3bf3e53b9874d0c12a4fdc9ac2 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 16 Jul 2021 12:23:39 +1000
Subject: [PATCH v19] PS - tmp - OpenTableList IsA logic
Since all elements of the list must be same kind only really need to
check the initial element kind, not every element.
---
src/backend/commands/publicationcmds.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 659f448..c323490 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -504,6 +504,14 @@ OpenTableList(List *tables)
List *rels = NIL;
ListCell *lc;
PublicationRelationInfo *pri;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a PublicationTable List
+ * (Relation, Where clause). ALTER PUBLICATION ... DROP TABLE provides
+ * a Relation List. Check the List element to be used.
+ */
+ whereclause = IsA(linitial(tables), PublicationTable);
/*
* Open, share-lock, and check all the explicitly-specified relations
@@ -515,17 +523,9 @@ OpenTableList(List *tables)
bool recurse;
Relation rel;
Oid myrelid;
- bool whereclause;
- /*
- * ALTER PUBLICATION ... ADD TABLE provides a PublicationTable List
- * (Relation, Where clause). ALTER PUBLICATION ... DROP TABLE provides
- * a Relation List. Check the List element to be used.
- */
- if (IsA(lfirst(lc), PublicationTable))
- whereclause = true;
- else
- whereclause = false;
+ /* Assert all list elements must be of same kind. */
+ Assert(whereclause == IsA(lfirst(lc), PublicationTable));
if (whereclause)
{
--
1.8.3.1
On Wed, Jul 14, 2021 at 4:30 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Wed, Jul 14, 2021 at 3:58 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
Is there some reasonable rule which of the old/new tuples (or both) to
use for the WHERE condition? Or maybe it'd be handy to allow referencing
OLD/NEW as in triggers?
I think for insert we are only allowing those rows to replicate which
are matching filter conditions, so if we updating any row then also we
should maintain that sanity right? That means at least on the NEW rows
we should apply the filter, IMHO. Said that, now if there is any row
inserted which were satisfying the filter and replicated, if we update
it with the new value which is not satisfying the filter then it will
not be replicated, I think that makes sense because if an insert is
not sending any row to a replica which is not satisfying the filter
then why update has to do that, right?
There is another theory in this regard which is what if the old row
(created by the previous insert) is not sent to the subscriber as that
didn't match the filter but after the update, we decide to send it
because the updated row (new row) matches the filter condition. In
this case, I think it will generate an update conflict on the
subscriber as the old row won't be present. As of now, we just skip
the update but in the future, we might have some conflict handling
there. If this is true then even if the new row matches the filter,
there is no guarantee that it will be applied on the subscriber-side
unless the old row also matches the filter. Sure, there could be a
case where the user might have changed the filter between insert and
update but maybe we can have a separate way to deal with such cases if
required like providing some provision where the user can specify
whether it would like to match old/new row in updates?
--
With Regards,
Amit Kapila.
On Fri, Jul 16, 2021 at 8:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jul 14, 2021 at 4:30 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Wed, Jul 14, 2021 at 3:58 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
Is there some reasonable rule which of the old/new tuples (or both) to
use for the WHERE condition? Or maybe it'd be handy to allow referencing
OLD/NEW as in triggers?
I think for insert we are only allowing those rows to replicate which
are matching filter conditions, so if we updating any row then also we
should maintain that sanity right? That means at least on the NEW rows
we should apply the filter, IMHO. Said that, now if there is any row
inserted which were satisfying the filter and replicated, if we update
it with the new value which is not satisfying the filter then it will
not be replicated, I think that makes sense because if an insert is
not sending any row to a replica which is not satisfying the filter
then why update has to do that, right?
There is another theory in this regard which is what if the old row
(created by the previous insert) is not sent to the subscriber as that
didn't match the filter but after the update, we decide to send it
because the updated row (new row) matches the filter condition. In
this case, I think it will generate an update conflict on the
subscriber as the old row won't be present. As of now, we just skip
the update but in the future, we might have some conflict handling
there. If this is true then even if the new row matches the filter,
there is no guarantee that it will be applied on the subscriber-side
unless the old row also matches the filter.
Yeah, it's a valid point.
Sure, there could be a
case where the user might have changed the filter between insert and
update but maybe we can have a separate way to deal with such cases if
required like providing some provision where the user can specify
whether it would like to match old/new row in updates?
Yeah, I think the best way is that users should get an option whether
they want to apply the filter on the old row or on the new row, or
both, in fact, they should be able to apply the different filters on
old and new rows. I have one more thought in mind: currently, we are
providing a filter for the publication table, doesn't it make sense to
provide filters for operations of the publication table? I mean the
different filters for Insert, delete, and the old row of update and
the new row of the update.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Fri, Jul 16, 2021 at 10:11 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Fri, Jul 16, 2021 at 8:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jul 14, 2021 at 4:30 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Wed, Jul 14, 2021 at 3:58 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
Is there some reasonable rule which of the old/new tuples (or both) to
use for the WHERE condition? Or maybe it'd be handy to allow referencing
OLD/NEW as in triggers?
I think for insert we are only allowing those rows to replicate which
are matching filter conditions, so if we updating any row then also we
should maintain that sanity right? That means at least on the NEW rows
we should apply the filter, IMHO. Said that, now if there is any row
inserted which were satisfying the filter and replicated, if we update
it with the new value which is not satisfying the filter then it will
not be replicated, I think that makes sense because if an insert is
not sending any row to a replica which is not satisfying the filter
then why update has to do that, right?
There is another theory in this regard which is what if the old row
(created by the previous insert) is not sent to the subscriber as that
didn't match the filter but after the update, we decide to send it
because the updated row (new row) matches the filter condition. In
this case, I think it will generate an update conflict on the
subscriber as the old row won't be present. As of now, we just skip
the update but in the future, we might have some conflict handling
there. If this is true then even if the new row matches the filter,
there is no guarantee that it will be applied on the subscriber-side
unless the old row also matches the filter.
Yeah, it's a valid point.
Sure, there could be a
case where the user might have changed the filter between insert and
update but maybe we can have a separate way to deal with such cases if
required like providing some provision where the user can specify
whether it would like to match old/new row in updates?
Yeah, I think the best way is that users should get an option whether
they want to apply the filter on the old row or on the new row, or
both, in fact, they should be able to apply the different filters on
old and new rows.
I am not so sure about different filters for old and new rows but it
makes sense to by default apply the filter to both old and new rows.
Then also provide a way for user to specify if the filter can be
specified to just old or new row.
I have one more thought in mind: currently, we are
providing a filter for the publication table, doesn't it make sense to
provide filters for operations of the publication table? I mean the
different filters for Insert, delete, and the old row of update and
the new row of the update.
Hmm, I think this sounds a bit of a stretch but if there is any field
use case then we can consider this in the future.
--
With Regards,
Amit Kapila.
On Fri, Jul 16, 2021 at 3:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I am not so sure about different filters for old and new rows but it
makes sense to by default apply the filter to both old and new rows.
Then also provide a way for user to specify if the filter can be
specified to just old or new row.
I'm having some doubts and concerns about what is being suggested.
My current thought and opinion is that the row filter should
(initially, or at least by default) specify the condition of the row
data at the publication boundary (i.e. what is actually sent to and
received by the subscriber). That means for UPDATE, I think that the
filter should operate on the new value.
This has the clear advantage of knowing (from the WHERE expression)
what restrictions are placed on the data that is actually published
and what subscribers will actually receive. So it's more predictable.
If we filter on OLD rows, then we would need to know exactly what is
updated by the UPDATE in order to know what is actually published (for
example, the UPDATE could modify the columns being checked in the
publication WHERE expression).
I'm not saying that's wrong, or a bad idea, but it's more complicated
and potentially confusing. Maybe there could be an option for it.
Also, even if we allowed OLD/NEW to be specified in the WHERE
expression, OLD wouldn't make sense for INSERT and NEW wouldn't make
sense for DELETE, so one WHERE expression with OLD/NEW references
wouldn't seem valid to cover all operations INSERT, UPDATE and DELETE.
I think that was what Dilip was essentially referring to, with his
suggestion of using different filters for different operations (though
I think that may be going too far for the initial implementation).
Regards,
Greg Nancarrow
Fujitsu Australia
On 7/16/21 5:26 AM, Amit Kapila wrote:
On Wed, Jul 14, 2021 at 4:30 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Wed, Jul 14, 2021 at 3:58 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
Is there some reasonable rule which of the old/new tuples (or both) to
use for the WHERE condition? Or maybe it'd be handy to allow referencing
OLD/NEW as in triggers?
I think for insert we are only allowing those rows to replicate which
are matching filter conditions, so if we updating any row then also we
should maintain that sanity right? That means at least on the NEW rows
we should apply the filter, IMHO. Said that, now if there is any row
inserted which were satisfying the filter and replicated, if we update
it with the new value which is not satisfying the filter then it will
not be replicated, I think that makes sense because if an insert is
not sending any row to a replica which is not satisfying the filter
then why update has to do that, right?
There is another theory in this regard which is what if the old row
(created by the previous insert) is not sent to the subscriber as that
didn't match the filter but after the update, we decide to send it
because the updated row (new row) matches the filter condition. In
this case, I think it will generate an update conflict on the
subscriber as the old row won't be present. As of now, we just skip
the update but in the future, we might have some conflict handling
there.
Right.
If this is true then even if the new row matches the filter,
there is no guarantee that it will be applied on the subscriber-side
unless the old row also matches the filter. Sure, there could be a
case where the user might have changed the filter between insert and
update but maybe we can have a separate way to deal with such cases if
required like providing some provision where the user can specify
whether it would like to match old/new row in updates?
I think the best we can do for now is to document this. AFAICS it can't
be solved without a conflict resolution that would turn the UPDATE to
INSERT. And that would require REPLICA IDENTITY FULL, otherwise the
UPDATE would not have data for all the columns.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2021-Jul-16, Greg Nancarrow wrote:
On Fri, Jul 16, 2021 at 3:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I am not so sure about different filters for old and new rows but it
makes sense to by default apply the filter to both old and new rows.
Then also provide a way for user to specify if the filter can be
specified to just old or new row.
I'm having some doubts and concerns about what is being suggested.
Yeah. I think the idea that some updates fail to reach the replica,
leaving the downstream database in a different state than it would be if
those updates had reached it, is unsettling. It makes me wish we raised
an error at UPDATE time if both rows would not pass the filter test in
the same way -- that is, if the old row passes the filter, then the new
row must be a pass as well.
Maybe a second option is to have replication change any UPDATE into
either an INSERT or a DELETE, if the old or the new row do not pass the
filter, respectively. That way, the databases would remain consistent.
--
Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/
"You're _really_ hosed if the person doing the hiring doesn't understand
relational systems: you end up with a whole raft of programmers, none of
whom has had a Date with the clue stick." (Andrew Sullivan)
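Alvaro's second option above (turn an UPDATE into an INSERT or DELETE when only one of the two row versions passes the filter, so the downstream stays consistent) would behave roughly like this sketch; the names and return convention are illustrative only:

```python
def transform_update(old_row, new_row, row_filter):
    """Decide what to send for an UPDATE so the subscriber stays consistent.

    Returns (action, old, new), or None when the row is never visible
    downstream.
    """
    old_ok, new_ok = row_filter(old_row), row_filter(new_row)
    if old_ok and new_ok:
        return ("UPDATE", old_row, new_row)   # ordinary replicated update
    if old_ok:
        return ("DELETE", old_row, None)      # row leaves the filtered set
    if new_ok:
        return ("INSERT", None, new_row)      # row enters the filtered set
    return None

flt = lambda row: row["id"] > 2000
transform_update({"id": 1500}, {"id": 2500}, flt)  # row enters the filtered set
```

The DELETE branch is what makes the replica identity discussion relevant: emitting a DELETE requires identity columns for the old row.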
On Sat, Jul 17, 2021 at 3:05 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2021-Jul-16, Greg Nancarrow wrote:
On Fri, Jul 16, 2021 at 3:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I am not so sure about different filters for old and new rows but it
makes sense to by default apply the filter to both old and new rows.
Then also provide a way for user to specify if the filter can be
specified to just old or new row.
I'm having some doubts and concerns about what is being suggested.
Yeah. I think the idea that some updates fail to reach the replica,
leaving the downstream database in a different state than it would be if
those updates had reached it, is unsettling. It makes me wish we raised
an error at UPDATE time if both rows would not pass the filter test in
the same way -- that is, if the old row passes the filter, then the new
row must be a pass as well.
Hmm, do you mean to say that we should raise an error in walsender while
decoding if the old or new row doesn't match the filter clause? How would
walsender come out of that error? Even, if seeing the error user
changed the filter clause for publication, I think it would still see
the old ones due to historical snapshot and keep on getting the same
error. One idea could be that we use the current snapshot to read the
publications catalog table, then the user would probably change the
filter or do something to move forward from this error. The other
options could be:
a. Just log it and move to the next row
b. send to stats collector some info about this which can be displayed
in a view and then move ahead
c. just skip it like any other row that doesn't match the filter clause.
I am not sure if there is any use of sending a row if one of the
old/new rows doesn't match the filter. Because if the old row doesn't
match but the new one matches the criteria, we will anyway just throw
such a row on the subscriber instead of applying it. OTOH, if old
matches but new doesn't match then it probably doesn't fit the analogy
that new rows should behave similarly to Inserts. I am of the opinion that
we should do either (a) or (c) when one of the old or new rows doesn't
match the filter clause.
Maybe a second option is to have replication change any UPDATE into
either an INSERT or a DELETE, if the old or the new row do not pass the
filter, respectively. That way, the databases would remain consistent.
I guess such things should be handled via conflict resolution on the
subscriber side.
--
With Regards,
Amit Kapila.
On Mon, Jul 19, 2021 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
a. Just log it and move to the next row
b. send to stats collector some info about this which can be displayed
in a view and then move ahead
c. just skip it like any other row that doesn't match the filter clause.
I am not sure if there is any use of sending a row if one of the
old/new rows doesn't match the filter. Because if the old row doesn't
match but the new one matches the criteria, we will anyway just throw
such a row on the subscriber instead of applying it.
But at some time that will be true even if we skip the row based on
(a) or (c), right? Suppose the OLD row was not satisfying the
condition but the NEW row is satisfying the condition, now even if we
skip this operation then in the next operation on the same row even if
both OLD and NEW rows are satisfying the filter the operation will
just be dropped by the subscriber right? because we did not send the
previous row when it first updated to value which were satisfying the
condition. So basically, any row is inserted which did not satisfy
the condition first then post that no matter how many updates we do to
that row either it will be skipped by the publisher because the OLD
row was not satisfying the condition or it will be skipped by the
subscriber as there was no matching row.
Maybe a second option is to have replication change any UPDATE into
either an INSERT or a DELETE, if the old or the new row do not pass the
filter, respectively. That way, the databases would remain consistent.
Yeah, I think this is the best way to keep the data consistent.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
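The divergence Dilip describes can be shown with a small simulation. It assumes the variant where an UPDATE is skipped unless both row versions pass the filter, and where the subscriber drops updates for rows it never received; all names are made up for illustration:

```python
# A tiny model of the failure mode: a row first inserted outside the
# filter can never reach the subscriber, no matter how many later
# updates satisfy the condition.

flt = lambda v: v > 2000
subscriber = {}   # what the replica ends up holding

def replicate(op, key, old, new):
    """Apply the 'skip unless both versions match' behavior discussed above."""
    if op == "INSERT" and flt(new):
        subscriber[key] = new
    elif op == "UPDATE" and flt(old) and flt(new):
        if key in subscriber:      # subscriber has no matching row: dropped
            subscriber[key] = new

replicate("INSERT", 1, None, 1500)    # fails the filter: never sent
replicate("UPDATE", 1, 1500, 2500)    # old version fails: skipped
replicate("UPDATE", 1, 2500, 2600)    # both pass, but the row never arrived
assert 1 not in subscriber            # permanently missing downstream
```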
On Wed, Jul 14, 2021 at 8:03 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 7/14/21 2:50 PM, Amit Kapila wrote:
On Wed, Jul 14, 2021 at 3:58 PM Tomas Vondra
I think apart from the above, it might be good if we can find what
some other databases do in this regard?
Yeah, that might tell us what the users would like to do with it. I did
some quick search, but haven't found much :-( The one thing I found is
that Debezium [1] allows accessing both the "old" and "new" rows through
value.before and value.after, and use both for filtering.
Okay, but does it apply a filter to both rows for an Update event?
[1]
https://wanna-joke.com/wp-content/uploads/2015/01/german-translation-comics-science.jpg
This link doesn't provide Debezium information, seems like a typo.
--
With Regards,
Amit Kapila.
On 7/19/21 1:00 PM, Dilip Kumar wrote:
On Mon, Jul 19, 2021 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
a. Just log it and move to the next row
b. send to stats collector some info about this which can be displayed
in a view and then move ahead
c. just skip it like any other row that doesn't match the filter clause.I am not sure if there is any use of sending a row if one of the
old/new rows doesn't match the filter. Because if the old row doesn't
match but the new one matches the criteria, we will anyway just throw
such a row on the subscriber instead of applying it.
But at some time that will be true even if we skip the row based on
(a) or (c) right. Suppose the OLD row was not satisfying the
condition but the NEW row is satisfying the condition, now even if we
skip this operation then in the next operation on the same row even if
both OLD and NEW rows are satisfying the filter the operation will
just be dropped by the subscriber right? because we did not send the
previous row when it first updated to value which were satisfying the
condition. So basically, any row is inserted which did not satisfy
the condition first then post that no matter how many updates we do to
that row either it will be skipped by the publisher because the OLD
row was not satisfying the condition or it will be skipped by the
subscriber as there was no matching row.
I have a feeling it's getting overly complicated, to the extent that
it'll be hard to explain to users and reason about. I don't think
there's a "perfect" solution for cases when the filter expression gives
different answers for old/new row - it'll always be surprising for some
users :-(
So maybe the best thing is to stick to the simple approach already used
e.g. by pglogical, which simply uses the new row when available (insert,
update) and the old one for deletes.
I think that behaves more or less sensibly and it's easy to explain.
All the other things (e.g. turning UPDATE to INSERT, advanced conflict
resolution etc.) will require a lot of other stuff, and I see them as
improvements of this simple approach.
Maybe a second option is to have replication change any UPDATE into
either an INSERT or a DELETE, if the old or the new row do not pass the
filter, respectively. That way, the databases would remain consistent.Yeah, I think this is the best way to keep the data consistent.
It'd also require REPLICA IDENTITY FULL, which seems like it'd add a
rather significant overhead.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 7/19/21 1:30 PM, Amit Kapila wrote:
On Wed, Jul 14, 2021 at 8:03 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 7/14/21 2:50 PM, Amit Kapila wrote:
On Wed, Jul 14, 2021 at 3:58 PM Tomas Vondra
I think apart from the above, it might be good if we can find what
some other databases do in this regard?
Yeah, that might tell us what the users would like to do with it. I did
some quick search, but haven't found much :-( The one thing I found is
that Debezium [1] allows accessing both the "old" and "new" rows through
value.before and value.after, and use both for filtering.
Okay, but does it apply a filter to both rows for an Update event?
[1]
https://wanna-joke.com/wp-content/uploads/2015/01/german-translation-comics-science.jpg
This link doesn't provide Debezium information, seems like a typo.
Uh, yeah - I copied a different link. I meant to send this one:
https://debezium.io/documentation/reference/configuration/filtering.html
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Jul 19, 2021 at 11:32 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
I have a feeling it's getting overly complicated, to the extent that
it'll be hard to explain to users and reason about. I don't think
there's a "perfect" solution for cases when the filter expression gives
different answers for old/new row - it'll always be surprising for some
users :-(
So maybe the best thing is to stick to the simple approach already used
e.g. by pglogical, which simply uses the new row when available (insert,
update) and the old one for deletes.
I think that behaves more or less sensibly and it's easy to explain.
All the other things (e.g. turning UPDATE to INSERT, advanced conflict
resolution etc.) will require a lot of other stuff, and I see them as
improvements of this simple approach.
+1
My thoughts on this are very similar.
Regards,
Greg Nancarrow
Fujitsu Australia
Hi,
I am interested in this feature and took a quick a look at the patch.
Here are a few comments.
(1)
+ appendStringInfo(&cmd, "%s", q);
We'd better use appendStringInfoString(&cmd, q);
(2)
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
Is it better to invoke eval_const_expressions or canonicalize_qual here to
simplify the expression ?
(3)
+ appendPQExpBuffer(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBuffer(&buf,
+ ", NULL");
we'd better use appendPQExpBufferStr instead of appendPQExpBuffer here.
(4)
nodeTag(expr) == T_FuncCall)
It might looks clearer to use IsA(expr, FuncCall) here.
Best regards,
Houzj
On 2021-Jul-19, Tomas Vondra wrote:
I have a feeling it's getting overly complicated, to the extent that
it'll be hard to explain to users and reason about. I don't think
there's a "perfect" solution for cases when the filter expression gives
different answers for old/new row - it'll always be surprising for some
users :-(
So maybe the best thing is to stick to the simple approach already used
e.g. by pglogical, which simply uses the new row when available (insert,
update) and the old one for deletes.
OK, no objection to that plan.
--
Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/
"No es bueno caminar con un hombre muerto"
On Mon, Jul 19, 2021 at 4:31 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Mon, Jul 19, 2021 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Maybe a second option is to have replication change any UPDATE into
either an INSERT or a DELETE, if the old or the new row do not pass the
filter, respectively. That way, the databases would remain consistent.
Yeah, I think this is the best way to keep the data consistent.
Today, while studying the behavior of this particular operation in
other databases, I found that IBM's InfoSphere Data Replication does
exactly this. See [1]. I think there is merit if we want to follow this
idea.
[1]: https://www.ibm.com/docs/en/idr/11.4.0?topic=rows-search-conditions
--
With Regards,
Amit Kapila.
On Mon, Jul 19, 2021 at 7:02 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 7/19/21 1:00 PM, Dilip Kumar wrote:
On Mon, Jul 19, 2021 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
a. Just log it and move to the next row
b. send to stats collector some info about this which can be displayed
in a view and then move ahead
c. just skip it like any other row that doesn't match the filter clause.
I am not sure if there is any use of sending a row if one of the
old/new rows doesn't match the filter. Because if the old row doesn't
match but the new one matches the criteria, we will anyway just throw
such a row on the subscriber instead of applying it.
But at some point that will be true even if we skip the row based on
(a) or (c) right. Suppose the OLD row was not satisfying the
condition but the NEW row is satisfying the condition, now even if we
skip this operation then in the next operation on the same row even if
both OLD and NEW rows are satisfying the filter the operation will
just be dropped by the subscriber right? because we did not send the
previous row when it was first updated to a value satisfying the
condition. So basically, if any row is inserted that did not satisfy
the condition at first, then no matter how many updates we do to
that row either it will be skipped by the publisher because the OLD
row was not satisfying the condition or it will be skipped by the
subscriber as there was no matching row.
I have a feeling it's getting overly complicated, to the extent that
it'll be hard to explain to users and reason about. I don't think
there's a "perfect" solution for cases when the filter expression gives
different answers for old/new row - it'll always be surprising for some
users :-(
It is possible but OTOH, the three replication solutions (Debezium,
Oracle, IBM's InfoSphere Data Replication) which have this feature
seem to filter based on both old and new rows in one way or another.
Also, I am not sure if the simple approach of just filtering based on the
new row is very clear because it can also confuse users in a way that
even if all the new rows match the filters, they don't see anything
on the subscriber and in fact, that can cause a lot of network
overhead without any gain.
So maybe the best thing is to stick to the simple approach already used
e.g. by pglogical, which simply uses the new row when available (insert,
update) and the old one for deletes.
I think that behaves more or less sensibly and it's easy to explain.
Okay, if nothing better comes up, then we can fall back to this option.
All the other things (e.g. turning UPDATE to INSERT, advanced conflict
resolution etc.) will require a lot of other stuff,
I have not evaluated this yet but I think spending some time thinking
about turning Update to Insert/Delete (yesterday's suggestion by
Alvaro) might be worth especially as that seems to be followed by some
other replication solution as well.
and I see them as
improvements of this simple approach.
Maybe a second option is to have replication change any UPDATE into
either an INSERT or a DELETE, if the old or the new row do not pass the
filter, respectively. That way, the databases would remain consistent.
Yeah, I think this is the best way to keep the data consistent.
It'd also require REPLICA IDENTITY FULL, which seems like it'd add a
rather significant overhead.
Why? I think it would just need similar restrictions as we are
planning for Delete operation such that filter columns must be either
present in primary or replica identity columns.
--
With Regards,
Amit Kapila.
On Tue, Jul 20, 2021 at 2:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Today, while studying the behavior of this particular operation in
other databases, I found that IBM's InfoSphere Data Replication does
exactly this. See [1]. I think there is merit if we want to follow this
idea.
So in this model (after initial sync of rows according to the filter),
for UPDATE, the OLD row is checked against the WHERE clause, to know
if the row had been previously published. If it hadn't, and the NEW
row satisfies the WHERE clause, then it needs to be published as an
INSERT. If it had been previously published, but the NEW row doesn't
satisfy the WHERE condition, then it needs to be published as a
DELETE. Otherwise, if both OLD and NEW rows satisfy the WHERE clause,
it needs to be published as an UPDATE.
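That decision table can be summarized in a small function (an illustrative sketch of the model described above; the names are hypothetical and not taken from InfoSphere or the patch):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * How an UPDATE should be published, given whether the old and new rows
 * satisfy the publication's WHERE clause.
 */
typedef enum
{
    PUBLISH_NOTHING,
    PUBLISH_INSERT,
    PUBLISH_UPDATE,
    PUBLISH_DELETE
} PublishAction;

static PublishAction
transform_update(bool old_matches, bool new_matches)
{
    if (!old_matches && new_matches)
        return PUBLISH_INSERT;  /* row enters the filtered set */
    if (old_matches && !new_matches)
        return PUBLISH_DELETE;  /* row leaves the filtered set */
    if (old_matches && new_matches)
        return PUBLISH_UPDATE;  /* row stays in the filtered set */
    return PUBLISH_NOTHING;     /* row was never visible to the subscriber */
}
```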
At least, that seems to be the model when the WHERE clause refers to
the NEW (updated) values, as used in most of their samples (i.e. in
that database "the current log record", indicated by a ":" prefix on
the column name).
I think that allowing the OLD values ("old log record") to be
referenced in the WHERE clause, as that model does, could be
potentially confusing.
Regards,
Greg Nancarrow
Fujitsu Australia
On Tue, Jul 20, 2021 at 11:38 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Tue, Jul 20, 2021 at 2:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Today, while studying the behavior of this particular operation in
other databases, I found that IBM's InfoSphere Data Replication does
exactly this. See [1]. I think there is merit if we want to follow this
idea.
So in this model (after initial sync of rows according to the filter),
for UPDATE, the OLD row is checked against the WHERE clause, to know
if the row had been previously published. If it hadn't, and the NEW
row satisfies the WHERE clause, then it needs to be published as an
INSERT. If it had been previously published, but the NEW row doesn't
satisfy the WHERE condition, then it needs to be published as a
DELETE. Otherwise, if both OLD and NEW rows satisfy the WHERE clause,
it needs to be published as an UPDATE.
Yeah, this is what I also understood.
At least, that seems to be the model when the WHERE clause refers to
the NEW (updated) values, as used in most of their samples (i.e. in
that database "the current log record", indicated by a ":" prefix on
the column name).
I think that allowing the OLD values ("old log record") to be
referenced in the WHERE clause, as that model does, could be
potentially confusing.
I think in terms of referring to old and new rows, we already have
terminology which we used at various other similar places. See Create
Rule docs [1]. For the WHERE clause, it says "Within condition and
command, the special table names NEW and OLD can be used to refer to
values in the referenced table. NEW is valid in ON INSERT and ON
UPDATE rules to refer to the new row being inserted or updated. OLD is
valid in ON UPDATE and ON DELETE rules to refer to the existing row
being updated or deleted.". We need similar things for the WHERE
clause in publication if we want special syntax to refer to old and
new rows.
I think if we use some existing way to refer to old/new values then it
shouldn't be confusing to users.
[1]: https://www.postgresql.org/docs/devel/sql-createrule.html
--
With Regards,
Amit Kapila.
On Tue, Jul 20, 2021 at 9:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jul 19, 2021 at 4:31 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Mon, Jul 19, 2021 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Maybe a second option is to have replication change any UPDATE into
either an INSERT or a DELETE, if the old or the new row do not pass the
filter, respectively. That way, the databases would remain consistent.
Yeah, I think this is the best way to keep the data consistent.
Today, while studying the behavior of this particular operation in
other databases, I found that IBM's InfoSphere Data Replication does
exactly this. See [1]. I think there is merit if we want to follow this
idea.
As per my initial analysis, there shouldn't be much difficulty in
implementing this behavior. We need to change the filter API
(pgoutput_row_filter) such that it tells us whether the filter is
satisfied by the old row, new row or both and then the caller should
be able to make a decision based on that. I think that should be
sufficient to turn update to insert/delete when required. I might be
missing something here but this doesn't appear to require any drastic
changes in the patch.
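A rough sketch of what that changed filter API might look like (all names here are hypothetical stand-ins, not the actual pgoutput_row_filter code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { int id; } ToyTuple;          /* stand-in for a heap tuple */
typedef bool (*RowFilter)(const ToyTuple *);  /* stand-in for the WHERE expr */

/*
 * Instead of returning a single boolean, report which of the old/new
 * tuples pass the filter, so the caller can decide to publish the
 * change as an INSERT, UPDATE or DELETE.
 */
typedef struct
{
    bool old_matches;
    bool new_matches;
} RowFilterResult;

static RowFilterResult
row_filter_check(RowFilter filter, const ToyTuple *oldtup, const ToyTuple *newtup)
{
    RowFilterResult res;

    /* A missing tuple (e.g. no old tuple for INSERT) cannot match. */
    res.old_matches = (oldtup != NULL) && filter(oldtup);
    res.new_matches = (newtup != NULL) && filter(newtup);
    return res;
}

/* Example filter used below. */
static bool
positive_id(const ToyTuple *t)
{
    return t->id > 0;
}
```

With a result like this, the caller in the output plugin could map (old fails, new passes) to INSERT and (old passes, new fails) to DELETE, as discussed earlier in the thread.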
--
With Regards,
Amit Kapila.
On Wed, Jul 14, 2021 at 2:08 AM Euler Taveira <euler@eulerto.com> wrote:
On Tue, Jul 13, 2021, at 12:25 AM, Peter Smith wrote:
I have reviewed the latest v18 patch. Below are some more review
comments and patches.
Peter, thanks for quickly checking the new patch. I'm attaching a new patch (v19).
The latest patch doesn't apply cleanly. Can you please rebase it and
see if you can address some simpler comments till we reach a consensus
on some of the remaining points?
--
With Regards,
Amit Kapila.
On 7/20/21 7:23 AM, Amit Kapila wrote:
On Mon, Jul 19, 2021 at 7:02 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 7/19/21 1:00 PM, Dilip Kumar wrote:
On Mon, Jul 19, 2021 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
a. Just log it and move to the next row
b. send to stats collector some info about this which can be displayed
in a view and then move ahead
c. just skip it like any other row that doesn't match the filter clause.
I am not sure if there is any use of sending a row if one of the
old/new rows doesn't match the filter. Because if the old row doesn't
match but the new one matches the criteria, we will anyway just throw
such a row on the subscriber instead of applying it.
But at some point that will be true even if we skip the row based on
(a) or (c) right. Suppose the OLD row was not satisfying the
condition but the NEW row is satisfying the condition, now even if we
skip this operation then in the next operation on the same row even if
both OLD and NEW rows are satisfying the filter the operation will
just be dropped by the subscriber right? because we did not send the
previous row when it was first updated to a value satisfying the
condition. So basically, if any row is inserted that did not satisfy
the condition at first, then no matter how many updates we do to
that row either it will be skipped by the publisher because the OLD
row was not satisfying the condition or it will be skipped by the
subscriber as there was no matching row.
I have a feeling it's getting overly complicated, to the extent that
it'll be hard to explain to users and reason about. I don't think
there's a "perfect" solution for cases when the filter expression gives
different answers for old/new row - it'll always be surprising for some
users :-(
It is possible but OTOH, the three replication solutions (Debezium,
Oracle, IBM's InfoSphere Data Replication) which have this feature
seem to filter based on both old and new rows in one way or another.
Also, I am not sure if the simple approach of just filtering based on the
new row is very clear because it can also confuse users in a way that
even if all the new rows match the filters, they don't see anything
on the subscriber and in fact, that can cause a lot of network
overhead without any gain.
True. My point is that it's easier to explain than when using some
combination of old/new row, and the approach "replicate if the filter
matches both rows" proposed in this thread would be confusing too.
If the subscriber database can be modified, we kinda already have this
issue already - the row can be deleted, and all UPDATEs will be lost.
Yes, for read-only replicas that won't happen, but I think we're moving
to use cases more advanced than that.
I think there are only two ways to *guarantee* this does not happen:
* prohibit updates of columns referenced in row filters
* some sort of conflict resolution, turning UPDATE to INSERT etc.
So maybe the best thing is to stick to the simple approach already used
e.g. by pglogical, which simply uses the new row when available (insert,
update) and the old one for deletes.
I think that behaves more or less sensibly and it's easy to explain.
Okay, if nothing better comes up, then we can fall back to this option.
All the other things (e.g. turning UPDATE to INSERT, advanced conflict
resolution etc.) will require a lot of other stuff,
I have not evaluated this yet but I think spending some time thinking
about turning Update to Insert/Delete (yesterday's suggestion by
Alvaro) might be worth especially as that seems to be followed by some
other replication solution as well.
I think that requires quite a bit of infrastructure, and I'd bet we'll
need to handle other types of conflicts too. I don't have a clear
opinion if that's required to get this patch working - I'd try getting
the simplest implementation with reasonable behavior, with those more
advanced things as future enhancements.
and I see them as
improvements of this simple approach.
Maybe a second option is to have replication change any UPDATE into
either an INSERT or a DELETE, if the old or the new row do not pass the
filter, respectively. That way, the databases would remain consistent.
Yeah, I think this is the best way to keep the data consistent.
It'd also require REPLICA IDENTITY FULL, which seems like it'd add a
rather significant overhead.
Why? I think it would just need similar restrictions as we are
planning for Delete operation such that filter columns must be either
present in primary or replica identity columns.
How else would you turn UPDATE to INSERT? For UPDATE we only send the
identity columns and modified columns, and the decision happens on the
subscriber. So we need to send everything if there's a risk we'll need
those columns. But it's early I only had one coffee, so I may be missing
something glaringly obvious.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Jul 20, 2021 at 2:39 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 7/20/21 7:23 AM, Amit Kapila wrote:
On Mon, Jul 19, 2021 at 7:02 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
So maybe the best thing is to stick to the simple approach already used
e.g. by pglogical, which simply uses the new row when available (insert,
update) and the old one for deletes.
I think that behaves more or less sensibly and it's easy to explain.
Okay, if nothing better comes up, then we can fall back to this option.
All the other things (e.g. turning UPDATE to INSERT, advanced conflict
resolution etc.) will require a lot of other stuff,
I have not evaluated this yet but I think spending some time thinking
about turning Update to Insert/Delete (yesterday's suggestion by
Alvaro) might be worth especially as that seems to be followed by some
other replication solution as well.
I think that requires quite a bit of infrastructure, and I'd bet we'll
need to handle other types of conflicts too.
Hmm, I don't see why we need any additional infrastructure here if we
do this at the publisher. I think this could be done without many
changes to the patch as explained in one of my previous emails [1].
I don't have a clear
opinion if that's required to get this patch working - I'd try getting
the simplest implementation with reasonable behavior, with those more
advanced things as future enhancements.
and I see them as
improvements of this simple approach.
Maybe a second option is to have replication change any UPDATE into
either an INSERT or a DELETE, if the old or the new row do not pass the
filter, respectively. That way, the databases would remain consistent.
Yeah, I think this is the best way to keep the data consistent.
It'd also require REPLICA IDENTITY FULL, which seems like it'd add a
rather significant overhead.
Why? I think it would just need similar restrictions as we are
planning for Delete operation such that filter columns must be either
present in primary or replica identity columns.
How else would you turn UPDATE to INSERT? For UPDATE we only send the
identity columns and modified columns, and the decision happens on the
subscriber.
Hmm, we log the entire new tuple and replica identity columns for the
old tuple in WAL for Update. And, we are going to use a new tuple for
Insert, so we have everything we need.
[1]: /messages/by-id/CAA4eK1+AXEd5bO-qPp6L9Ptckk09nbWvP8V7q5UW4hg+kHjXwQ@mail.gmail.com
--
With Regards,
Amit Kapila.
On Tue, Jul 20, 2021 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Why? I think it would just need similar restrictions as we are
planning for Delete operation such that filter columns must be either
present in primary or replica identity columns.
How else would you turn UPDATE to INSERT? For UPDATE we only send the
identity columns and modified columns, and the decision happens on the
subscriber.
Hmm, we log the entire new tuple and replica identity columns for the
old tuple in WAL for Update. And, we are going to use a new tuple for
Insert, so we have everything we need.
But for making that decision we need to apply the filter on the old
rows as well right. So if we want to apply the filter on the old rows
then either the filter should only be on the replica identity key or
we need to use REPLICA IDENTITY FULL. I think that is what Tomas
wants to point out.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Jul 20, 2021 at 3:19 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Jul 20, 2021 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Why? I think it would just need similar restrictions as we are
planning for Delete operation such that filter columns must be either
present in primary or replica identity columns.
How else would you turn UPDATE to INSERT? For UPDATE we only send the
identity columns and modified columns, and the decision happens on the
subscriber.
Hmm, we log the entire new tuple and replica identity columns for the
old tuple in WAL for Update. And, we are going to use a new tuple for
Insert, so we have everything we need.
But for making that decision we need to apply the filter on the old
rows as well right. So if we want to apply the filter on the old rows
then either the filter should only be on the replica identity key or
we need to use REPLICA IDENTITY FULL. I think that is what Tomas
wants to point out.
I have already mentioned that for Updates the filter needs criteria
similar to Deletes. This is exactly the requirement for Delete as
well.
--
With Regards,
Amit Kapila.
On 7/20/21 11:42 AM, Amit Kapila wrote:
On Tue, Jul 20, 2021 at 2:39 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 7/20/21 7:23 AM, Amit Kapila wrote:
On Mon, Jul 19, 2021 at 7:02 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
So maybe the best thing is to stick to the simple approach already used
e.g. by pglogical, which simply uses the new row when available (insert,
update) and the old one for deletes.
I think that behaves more or less sensibly and it's easy to explain.
Okay, if nothing better comes up, then we can fall back to this option.
All the other things (e.g. turning UPDATE to INSERT, advanced conflict
resolution etc.) will require a lot of other stuff,
I have not evaluated this yet but I think spending some time thinking
about turning Update to Insert/Delete (yesterday's suggestion by
Alvaro) might be worth especially as that seems to be followed by some
other replication solution as well.
I think that requires quite a bit of infrastructure, and I'd bet we'll
need to handle other types of conflicts too.
Hmm, I don't see why we need any additional infrastructure here if we
do this at the publisher. I think this could be done without many
changes to the patch as explained in one of my previous emails [1].
Oh, I see. I've been thinking about doing the "usual" conflict
resolution on the subscriber side. I'm not sure about doing this on the
publisher ...
I don't have a clear
opinion if that's required to get this patch working - I'd try getting
the simplest implementation with reasonable behavior, with those more
advanced things as future enhancements.
and I see them as
improvements of this simple approach.
Maybe a second option is to have replication change any UPDATE into
either an INSERT or a DELETE, if the old or the new row do not pass the
filter, respectively. That way, the databases would remain consistent.
Yeah, I think this is the best way to keep the data consistent.
It'd also require REPLICA IDENTITY FULL, which seems like it'd add a
rather significant overhead.
Why? I think it would just need similar restrictions as we are
planning for Delete operation such that filter columns must be either
present in primary or replica identity columns.
How else would you turn UPDATE to INSERT? For UPDATE we only send the
identity columns and modified columns, and the decision happens on the
subscriber.
Hmm, we log the entire new tuple and replica identity columns for the
old tuple in WAL for Update. And, we are going to use a new tuple for
Insert, so we have everything we need.
Do we log the TOAST-ed values that were not updated?
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Jul 20, 2021 at 3:43 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
Do we log the TOAST-ed values that were not updated?
No, we don't, I have submitted a patch sometime back to fix that [1].
[1]: https://commitfest.postgresql.org/33/3162/
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Jul 20, 2021 at 6:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I think in terms of referring to old and new rows, we already have
terminology which we used at various other similar places. See Create
Rule docs [1]. For where clause, it says "Within condition and
command, the special table names NEW and OLD can be used to refer to
values in the referenced table. NEW is valid in ON INSERT and ON
UPDATE rules to refer to the new row being inserted or updated. OLD is
valid in ON UPDATE and ON DELETE rules to refer to the existing row
being updated or deleted.". We need similar things for the WHERE
clause in publication if we want special syntax to refer to old and
new rows.
I have no doubt we COULD allow references to OLD and NEW in the WHERE
clause, but do we actually want to?
This is what I thought could cause confusion, when mixed with the
model that I previously described.
It's not entirely clear to me exactly how it works, when the WHERE
clause is applied to the OLD and NEW rows, when the WHERE condition
itself can refer to OLD and/or NEW (coupled with the fact that NEW
doesn't make sense for DELETE and OLD doesn't make sense for INSERT).
Combine that with the fact that a publication can have multiple tables
each with their own WHERE clause, and tables can be dropped/(re)added
to the publication with a different WHERE clause, and it starts to get
a little complicated working out exactly what the result should be.
Regards,
Greg Nancarrow
Fujitsu Australia
On Tue, Jul 20, 2021 at 5:13 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Tue, Jul 20, 2021 at 6:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I think in terms of referring to old and new rows, we already have
terminology which we used at various other similar places. See Create
Rule docs [1]. For where clause, it says "Within condition and
command, the special table names NEW and OLD can be used to refer to
values in the referenced table. NEW is valid in ON INSERT and ON
UPDATE rules to refer to the new row being inserted or updated. OLD is
valid in ON UPDATE and ON DELETE rules to refer to the existing row
being updated or deleted.". We need similar things for the WHERE
clause in publication if we want special syntax to refer to old and
new rows.
I have no doubt we COULD allow references to OLD and NEW in the WHERE
clause, but do we actually want to?
This is what I thought could cause confusion, when mixed with the
model that I previously described.
It's not entirely clear to me exactly how it works, when the WHERE
clause is applied to the OLD and NEW rows, when the WHERE condition
itself can refer to OLD and/or NEW (coupled with the fact that NEW
doesn't make sense for DELETE and OLD doesn't make sense for INSERT).
It is not new, the same is true when they are used in RULES and
probably in other places where we use them.
--
With Regards,
Amit Kapila.
On Tue, Jul 20, 2021 at 4:33 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Jul 20, 2021 at 3:43 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
Do we log the TOAST-ed values that were not updated?
No, we don't, I have submitted a patch sometime back to fix that [1]
That patch seems to log WAL for key unchanged columns. What about if
unchanged non-key columns? Do they get logged as part of the new tuple
or is there some other way we can get those? If not, then we need to
probably think of restricting filter clause in some way.
--
With Regards,
Amit Kapila.
On Thu, Jul 22, 2021 at 5:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jul 20, 2021 at 4:33 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Jul 20, 2021 at 3:43 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
Do we log the TOAST-ed values that were not updated?
No, we don't, I have submitted a patch sometime back to fix that [1]
That patch seems to log WAL for key unchanged columns. What about if
unchanged non-key columns? Do they get logged as part of the new tuple
or is there some other way we can get those? If not, then we need to
probably think of restricting filter clause in some way.
But what sort of restrictions? I mean we can not put based on data
type right that will be too restrictive, other option is only to allow
replica identity keys columns in the filter condition?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Thu, Jul 22, 2021 at 8:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Thu, Jul 22, 2021 at 5:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jul 20, 2021 at 4:33 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Jul 20, 2021 at 3:43 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
Do we log the TOAST-ed values that were not updated?
No, we don't, I have submitted a patch sometime back to fix that [1]
That patch seems to log WAL for key unchanged columns. What about if
unchanged non-key columns? Do they get logged as part of the new tuple
or is there some other way we can get those? If not, then we need to
probably think of restricting filter clause in some way.
But what sort of restrictions? I mean we can not put based on data
type right that will be too restrictive,
Yeah, data type restriction sounds too restrictive and unless the data
is toasted, the data will be anyway available. I think such kind of
restriction should be the last resort but let's try to see if we can
do something better.
other option is only to allow
replica identity keys columns in the filter condition?
Yes, that is what I had in mind because if key column(s) is changed
then we will have data for both old and new tuples. But if it is not
changed then we will have it probably for the old tuple unless we
decide to fix the bug you mentioned in a different way in which case
we might either need to log it for the purpose of this feature (but
that will be any way for HEAD) or need to come up with some other
solution here. I think we can't even fetch such columns data during
decoding because we have catalog-only historic snapshots here. Do you
have any better ideas?
--
With Regards,
Amit Kapila.
On Fri, Jul 23, 2021 at 8:29 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Jul 22, 2021 at 8:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Thu, Jul 22, 2021 at 5:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jul 20, 2021 at 4:33 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Jul 20, 2021 at 3:43 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
Do we log the TOAST-ed values that were not updated?
No, we don't, I have submitted a patch sometime back to fix that [1]
That patch seems to log WAL for key unchanged columns. What about if
unchanged non-key columns? Do they get logged as part of the new tuple
or is there some other way we can get those? If not, then we need to
probably think of restricting filter clause in some way.
But what sort of restrictions? I mean we can not put based on data
type right that will be too restrictive,
Yeah, data type restriction sounds too restrictive and unless the data
is toasted, the data will be anyway available. I think such kind of
restriction should be the last resort but let's try to see if we can
do something better.
other option is only to allow
replica identity keys columns in the filter condition?
Yes, that is what I had in mind because if key column(s) is changed
then we will have data for both old and new tuples. But if it is not
changed then we will have it probably for the old tuple unless we
decide to fix the bug you mentioned in a different way in which case
we might either need to log it for the purpose of this feature (but
that will be any way for HEAD) or need to come up with some other
solution here. I think we can't even fetch such columns data during
decoding because we have catalog-only historic snapshots here. Do you
have any better ideas?
BTW, I wonder how pglogical can handle this because if these unchanged
toasted values are not logged in WAL for the new tuple then how the
comparison for such columns will work? Either they are forcing WAL in
some way or don't allow WHERE clause on such columns or maybe they
have dealt with it in some other way unless they are unaware of this
problem.
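To make the restriction being discussed concrete, here is a hypothetical sketch (the table, column names, and publication names are illustrative, not from the patch):

```sql
-- Hypothetical sketch: the filter references only the replica identity
-- (here, the primary key column id), so both old and new tuples always
-- carry the value needed to evaluate it.
CREATE TABLE orders (id int PRIMARY KEY, region text, amount numeric);
ALTER TABLE orders REPLICA IDENTITY DEFAULT;  -- primary key is the identity

-- Allowed under the proposed restriction: filter on a key column.
CREATE PUBLICATION pub_key_only FOR TABLE orders WHERE (id > 1000);

-- Would be rejected under the restriction: region is not part of the
-- replica identity, so the old tuple may not contain its value.
-- CREATE PUBLICATION pub_nonkey FOR TABLE orders WHERE (region = 'emea');
```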
--
With Regards,
Amit Kapila.
On Fri, Jul 23, 2021 at 8:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jul 23, 2021 at 8:29 AM Amit Kapila <amit.kapila16@gmail.com>
wrote:
On Thu, Jul 22, 2021 at 8:06 PM Dilip Kumar <dilipbalaut@gmail.com>
wrote:
On Thu, Jul 22, 2021 at 5:15 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:
On Tue, Jul 20, 2021 at 4:33 PM Dilip Kumar <dilipbalaut@gmail.com>
wrote:
On Tue, Jul 20, 2021 at 3:43 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
Do we log the TOAST-ed values that were not updated?
No, we don't, I have submitted a patch sometime back to fix that
[1]
That patch seems to log WAL for unchanged key columns. What about
unchanged non-key columns? Do they get logged as part of the new tuple
or is there some other way we can get those? If not, then we need to
probably think of restricting the filter clause in some way.
But what sort of restrictions? I mean, we cannot restrict based on data
type, right? That would be too restrictive.
Yeah, a data type restriction sounds too restrictive, and unless the data
is toasted, the data will be available anyway. I think that kind of
restriction should be a last resort, but let's try to see if we can
do something better.
The other option is to only allow
replica identity key columns in the filter condition?
Yes, that is what I had in mind because if key column(s) are changed
then we will have data for both old and new tuples. But if it is not
changed then we will have it probably for the old tuple unless we
decide to fix the bug you mentioned in a different way in which case
we might either need to log it for the purpose of this feature (but
that will be any way for HEAD) or need to come up with some other
solution here. I think we can't even fetch such columns data during
decoding because we have catalog-only historic snapshots here. Do you
have any better ideas?
BTW, I wonder how pglogical can handle this because if these unchanged
toasted values are not logged in WAL for the new tuple then how the
comparison for such columns will work? Either they are forcing WAL in
some way or don't allow WHERE clause on such columns or maybe they
have dealt with it in some other way unless they are unaware of this
problem.
The column comparison for row filtering happens before the unchanged toast
columns are filtered out. Unchanged toast columns are filtered out just
before writing the tuple to the output stream. I think this is the case both
for pglogical and the proposed patch.
So, I can't see why not logging unchanged toast columns would be a problem
for row filtering. Am I missing something?
Thank you,
Rahila Syed
On Fri, Jul 23, 2021 at 2:27 PM Rahila Syed <rahilasyed90@gmail.com> wrote:
On Fri, Jul 23, 2021 at 8:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jul 23, 2021 at 8:29 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Jul 22, 2021 at 8:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Thu, Jul 22, 2021 at 5:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jul 20, 2021 at 4:33 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Jul 20, 2021 at 3:43 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
Do we log the TOAST-ed values that were not updated?
No, we don't, I have submitted a patch sometime back to fix that [1]
That patch seems to log WAL for unchanged key columns. What about
unchanged non-key columns? Do they get logged as part of the new tuple
or is there some other way we can get those? If not, then we need to
probably think of restricting the filter clause in some way.
But what sort of restrictions? I mean, we cannot restrict based on data
type, right? That would be too restrictive.
Yeah, a data type restriction sounds too restrictive, and unless the data
is toasted, the data will be available anyway. I think that kind of
restriction should be a last resort, but let's try to see if we can
do something better.
The other option is to only allow
replica identity key columns in the filter condition?
Yes, that is what I had in mind because if key column(s) are changed
then we will have data for both old and new tuples. But if it is not
changed then we will have it probably for the old tuple unless we
decide to fix the bug you mentioned in a different way in which case
we might either need to log it for the purpose of this feature (but
that will be any way for HEAD) or need to come up with some other
solution here. I think we can't even fetch such columns data during
decoding because we have catalog-only historic snapshots here. Do you
have any better ideas?
BTW, I wonder how pglogical can handle this because if these unchanged
toasted values are not logged in WAL for the new tuple then how the
comparison for such columns will work? Either they are forcing WAL in
some way or don't allow WHERE clause on such columns or maybe they
have dealt with it in some other way unless they are unaware of this
problem.
The column comparison for row filtering happens before the unchanged toast
columns are filtered. Unchanged toast columns are filtered just before writing the tuple
to output stream.
To perform filtering, you need to use the tuple from WAL and that
tuple doesn't seem to have unchanged toast values, so how can we do
filtering? I think it is a good idea to test this once.
--
With Regards,
Amit Kapila.
On July 23, 2021 6:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jul 23, 2021 at 2:27 PM Rahila Syed <rahilasyed90@gmail.com> wrote:
The column comparison for row filtering happens before the unchanged
toast columns are filtered. Unchanged toast columns are filtered just
before writing the tuple to the output stream.
To perform filtering, you need to use the tuple from WAL, and that tuple doesn't
seem to have unchanged toast values, so how can we do filtering? I think it is a
good idea to test this once.
Agreed.
Currently, neither unchanged toasted key columns nor unchanged toasted non-key
columns are logged. So, we cannot get the toasted values directly for these
columns when doing row filtering.
I tested the current patch for toasted data and found a problem: In the current
patch, it will try to fetch the toast data from the toast table when doing row
filtering [1]. But it's unsafe to do that in the walsender. We can see it uses a
HISTORIC snapshot in heap_fetch_toast_slice() and also the comments of
init_toast_snapshot() have said "Detoasting *must* happen in the same
transaction that originally fetched the toast pointer.". The toast data could
have been changed by the time we do row filtering. For example, I tested the
following steps and got an error.
1) UPDATE a nonkey column in publisher.
2) Use debugger to block the walsender process in function
pgoutput_row_filter_exec_expr().
3) Open another psql to connect the publisher, and drop the table which updated
in 1).
4) Unblock the debugger in 2), and then I can see the following error:
---
ERROR: could not read block 0 in file "base/13675/16391"
---
[1]:
(1)------publisher------
CREATE TABLE toasted_key (
id serial,
toasted_key text PRIMARY KEY,
toasted_col1 text,
toasted_col2 text
);
select repeat('9999999999', 200) as tvalue \gset
CREATE PUBLICATION pub FOR TABLE toasted_key WHERE (toasted_col2 = :'tvalue');
ALTER TABLE toasted_key REPLICA IDENTITY USING INDEX toasted_key_pkey;
ALTER TABLE toasted_key ALTER COLUMN toasted_key SET STORAGE EXTERNAL;
ALTER TABLE toasted_key ALTER COLUMN toasted_col1 SET STORAGE EXTERNAL;
ALTER TABLE toasted_key ALTER COLUMN toasted_col2 SET STORAGE EXTERNAL;
INSERT INTO toasted_key(toasted_key, toasted_col1, toasted_col2) VALUES(repeat('1234567890', 200), repeat('9876543210', 200), repeat('9999999999', 200));
(2)------subscriber------
CREATE TABLE toasted_key (
id serial,
toasted_key text PRIMARY KEY,
toasted_col1 text,
toasted_col2 text
);
CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres port=10000' PUBLICATION pub;
(3)------publisher------
UPDATE toasted_key SET toasted_col1 = repeat('1111113113', 200);
Based on the above steps, the row filter will go through the following path
and fetch toast data in the walsender.
------
pgoutput_row_filter_exec_expr
...
texteq
...
text *targ1 = DatumGetTextPP(arg1);
pg_detoast_datum_packed
detoast_attr
------
On Tue, Jul 27, 2021 at 6:21 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
1) UPDATE a nonkey column in publisher.
2) Use debugger to block the walsender process in function
pgoutput_row_filter_exec_expr().
3) Open another psql to connect the publisher, and drop the table which updated
in 1).
4) Unblock the debugger in 2), and then I can see the following error:
---
ERROR: could not read block 0 in file "base/13675/16391"
Yeah, that's a big problem; it seems like the expression evaluation
machinery is directly going and detoasting the externally stored data
using some random snapshot. Ideally, in the walsender we can never
attempt to detoast the data because there is no guarantee that those
data are preserved. Somehow, before going to the expression evaluation
machinery, I think we will have to deform that tuple and do something
about the externally stored data, otherwise it will be very
difficult to control that inside the expression evaluation.
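As a side note (this is only a user-level mitigation sketch, not the output-plugin code change being proposed here): while the decoding-time problem remains unsolved, keeping filtered columns out of TOAST-able storage would mean the walsender never needs to follow a TOAST pointer at all:

```sql
-- Illustrative workaround only: force the filtered column to be stored
-- inline (PLAIN), so its value is always present in the logged tuple
-- and the row filter never has to detoast anything. Note that PLAIN
-- storage means oversized values can no longer be moved out of line,
-- so very long values will fail to insert.
ALTER TABLE toasted_key ALTER COLUMN toasted_col2 SET STORAGE PLAIN;
```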
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
FYI - v19 --> v20
(Only very minimal changes. Nothing functional)
Changes:
* The v19 patch was broken due to changes of commit [1] so I have
rebased so the cfbot is happy.
* I also renamed the TAP test 021_row_filter.pl ==> 023_row_filter.pl
because commit [2] already added another TAP test numbered 021.
------
[1]: https://github.com/postgres/postgres/commit/2b00db4fb0c7f02f000276bfadaab65a14059168
[2]: https://github.com/postgres/postgres/commit/a8fd13cab0ba815e9925dc9676e6309f699b5f72
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v20-0001-Row-filter-for-logical-replication.patch
From 3792f480b941d8b57716bdad152fcb08086f4029 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Tue, 27 Jul 2021 17:43:58 +1000
Subject: [PATCH v20] Row filter for logical replication
This feature adds row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If subscriber is a pre-15 version, data
synchronization won't use row filters if they are defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 47 ++++-
src/backend/commands/publicationcmds.c | 112 +++++++----
src/backend/nodes/copyfuncs.c | 14 ++
src/backend/nodes/equalfuncs.c | 12 ++
src/backend/parser/gram.y | 24 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 13 ++
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 255 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 71 +++++++
src/test/regress/sql/publication.sql | 32 +++
src/test/subscription/t/023_row_filter.pl | 298 ++++++++++++++++++++++++++++
26 files changed, 1048 insertions(+), 75 deletions(-)
create mode 100644 src/test/subscription/t/023_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2b2c70a..d473af1 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6231,6 +6231,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b..4bb4314 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbc..35006d9 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -183,6 +187,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or be covered by <literal>REPLICA
+ IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
+ be replicated. That's because old row is used and it only contains primary
+ key or columns that are part of the <literal>REPLICA IDENTITY</literal>; the
+ remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ and <command>UPDATE</command> operations, any column might be used in the
+ <literal>WHERE</literal> clause. New row is used and it contains all
+ columns. A <literal>NULL</literal> value causes the expression to evaluate
+ to false; avoid using columns without not-null constraints in the
+ <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
+ not allow functions and user-defined operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -197,6 +216,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -210,6 +234,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 1433905..e0149e7 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 2a2fe03..6057fc3 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,21 +144,27 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -177,6 +186,26 @@ publication_add_relation(Oid pubid, Relation targetrel,
check_publication_add_relation(targetrel);
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
+
/* Form a tuple. */
memset(values, 0, sizeof(values));
memset(nulls, false, sizeof(nulls));
@@ -189,6 +218,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -205,6 +240,14 @@ publication_add_relation(Oid pubid, Relation targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 8487eeb..4709a71 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -384,31 +384,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- Relation newrel = (Relation) lfirst(newlc);
-
- if (RelationGetRelid(newrel) == oldrelid)
- {
- found = true;
- break;
- }
- }
-
- if (!found)
- {
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
-
- delrels = lappend(delrels, oldrel);
- }
+ PublicationRelationInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelationInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -498,7 +491,8 @@ RemovePublicationRelById(Oid proid)
}
/*
- * Open relations specified by a RangeVar list.
+ * Open relations specified by a RangeVar list (PublicationTable or Relation).
+ *
* The returned tables are locked in ShareUpdateExclusiveLock mode in order to
* add them to a publication.
*/
@@ -508,16 +502,41 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationInfo *pri;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = lfirst_node(RangeVar, lc);
- bool recurse = rv->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a PublicationTable List
+ * (Relation, Where clause). ALTER PUBLICATION ... DROP TABLE provides
+ * a Relation List. Check the List element to be used.
+ */
+ if (IsA(lfirst(lc), PublicationTable))
+ whereclause = true;
+ else
+ whereclause = false;
+
+ if (whereclause)
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+ else
+ {
+ rv = lfirst_node(RangeVar, lc);
+ }
+
+ recurse = rv->inh;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
@@ -537,8 +556,14 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = myrelid;
+ pri->relation = rel;
+ if (whereclause)
+ pri->whereClause = t->whereClause;
+ else
+ pri->whereClause = NULL;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, myrelid);
/*
@@ -571,7 +596,15 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = childrelid;
+ pri->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (whereclause)
+ pri->whereClause = t->whereClause;
+ else
+ pri->whereClause = NULL;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, childrelid);
}
}
@@ -592,10 +625,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(pri->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -611,15 +646,15 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(pri->relid, GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(pri->relation->rd_rel->relkind),
+ RelationGetRelationName(pri->relation));
- obj = publication_add_relation(pubid, rel, if_not_exists);
+ obj = publication_add_relation(pubid, pri, if_not_exists);
if (stmt)
{
EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
@@ -643,11 +678,10 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pri->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -657,7 +691,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pri->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 29020c9..63abfdd 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4840,6 +4840,17 @@ _copyAlterPublicationStmt(const AlterPublicationStmt *from)
return newnode;
}
+static PublicationTable *
+_copyPublicationTable(const PublicationTable *from)
+{
+ PublicationTable *newnode = makeNode(PublicationTable);
+
+ COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
+
+ return newnode;
+}
+
static CreateSubscriptionStmt *
_copyCreateSubscriptionStmt(const CreateSubscriptionStmt *from)
{
@@ -5704,6 +5715,9 @@ copyObjectImpl(const void *from)
case T_AlterPublicationStmt:
retval = _copyAlterPublicationStmt(from);
break;
+ case T_PublicationTable:
+ retval = _copyPublicationTable(from);
+ break;
case T_CreateSubscriptionStmt:
retval = _copyCreateSubscriptionStmt(from);
break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 8a17620..3ec66c4 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2315,6 +2315,15 @@ _equalAlterPublicationStmt(const AlterPublicationStmt *a,
}
static bool
+_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
+{
+ COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
+
+ return true;
+}
+
+static bool
_equalCreateSubscriptionStmt(const CreateSubscriptionStmt *a,
const CreateSubscriptionStmt *b)
{
@@ -3700,6 +3709,9 @@ equal(const void *a, const void *b)
case T_AlterPublicationStmt:
retval = _equalAlterPublicationStmt(a, b);
break;
+ case T_PublicationTable:
+ retval = _equalPublicationTable(a, b);
+ break;
case T_CreateSubscriptionStmt:
retval = _equalCreateSubscriptionStmt(a, b);
break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 10da5c5..fcb9fb3 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -426,14 +426,14 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
vacuum_relation_list opt_vacuum_relation_list
- drop_option_list
+ drop_option_list publication_table_list
%type <node> opt_routine_body
%type <groupclause> group_clause
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -9612,7 +9612,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9643,7 +9643,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9651,7 +9651,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9669,6 +9669,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 24268eb..8fb953b 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -950,6 +957,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f928c32..fc4170e 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -509,6 +516,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1769,6 +1777,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3089,6 +3100,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..e946f17 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Disallow user-defined operators in publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..48bdbc3 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get relation qual. DISTINCT prevents the same expression for a table in
+ * multiple publications from being included more than once in the final
+ * expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we must not copy generated columns. For
+ * tables with row filters, build a SELECT query whose WHERE clause
+ * is the conjunction (AND) of all row filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index e4314af..13fd37d 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -112,6 +121,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -135,7 +146,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -144,6 +155,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -630,6 +648,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache ExprState using CacheMemoryContext. This is the same code as
+ * ExecPrepareExpr() but that is not used because it doesn't use an EState.
+ * It should probably be another function in the executor to handle the
+ * execution outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the
+ * change is not replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * If this returns true, the change is replicated; otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has
+ * a different row filter in each of them, all row filters must match
+ * for the change to be replicated.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -656,7 +817,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -680,8 +841,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, txn, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -689,6 +848,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * Schema should be sent before the logic that replaces the
+ * relation because it also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -712,6 +881,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -740,6 +915,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -803,7 +984,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1104,9 +1285,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1129,6 +1311,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1140,6 +1324,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1153,6 +1338,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for the row filter. The TupleDesc must
+ * live as long as the cache entry remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1162,6 +1363,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1221,9 +1425,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1341,6 +1569,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1350,6 +1579,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1367,7 +1598,13 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 90ac445..f4f1298 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4148,6 +4148,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4158,9 +4159,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4169,6 +4177,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4209,6 +4218,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4241,8 +4254,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index f5e170e..0c31005 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -627,6 +627,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 8333558..97250d2 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBuffer(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index f332bad..2703b9c 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -83,6 +83,13 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationInfo
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationInfo;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -108,7 +115,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index f7b009e..065d99d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -491,6 +491,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 947660a..ec7710d 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3624,12 +3624,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3642,7 +3649,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 1500de2..4537543 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 4a5ef0b..319c6bc 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -158,6 +158,77 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: cannot add relation "testpub_view" to publication
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075..b1606cc 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,38 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/023_row_filter.pl b/src/test/subscription/t/023_row_filter.pl
new file mode 100644
index 0000000..0f6d2f0
--- /dev/null
+++ b/src/test/subscription/t/023_row_filter.pl
@@ -0,0 +1,298 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions must succeed. In this case, rows are replicated only if c is
+# divisible by both 2 AND 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
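The multi-publication semantics that the TAP test above relies on can be sketched in plain Python. This is only a model of the documented behavior, not the patch's C code: when a table is published with different WHERE clauses in several publications, a row is replicated only if every filter is satisfied, and a filter that evaluates to NULL counts as false.

```python
# Model of how row filters from several publications combine for one table:
# the row must satisfy ALL filters, and SQL NULL (None here) is treated as
# false, per the patch's documented semantics.

def eval_filter(qual, row):
    """Evaluate one WHERE qual; None (SQL NULL) counts as false."""
    result = qual(row)
    return bool(result) if result is not None else False

def row_is_published(quals, row):
    """A row is published only if every publication's filter passes."""
    return all(eval_filter(q, row) for q in quals)

# tab_rowfilter_2 is in tap_pub_1 WHERE (c % 2 = 0) and tap_pub_2
# WHERE (c % 3 = 0); the test inserts c = 1..20.
quals = [lambda r: r["c"] % 2 == 0, lambda r: r["c"] % 3 == 0]
published = [c for c in range(1, 21) if row_is_published(quals, {"c": c})]
# Only multiples of 6 satisfy both filters: [6, 12, 18]
```

This matches the `count(c), min(c), max(c)` check in the test, which expects `3|6|18`.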
FYI - v20 --> v21
(Only very minimal changes)
* I noticed that the v20 TAP test (023_row_filter.pl) began failing
due to a recent commit [1], so I have rebased it to keep the cfbot
happy.
------
[1]: https://github.com/postgres/postgres/commit/201a76183e2056c2217129e12d68c25ec9c559c8
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v21-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v21-0001-Row-filter-for-logical-replication.patchDownload
From a9dd1df5a9ab9e22de8a33594e8331e9a9d293d6 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Tue, 3 Aug 2021 12:46:27 +1000
Subject: [PATCH v21] Row filter for logical replication
This feature adds a row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If the subscriber is a pre-15 version, data
synchronization won't use row filters if they are defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines whether it uses the partition's row filter
(if the parameter is false, the default) or the root partitioned table's row filter.
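The old-row vs. new-row rule described above can be sketched as a small Python model (hypothetical helper names, not the patch's implementation): INSERT and UPDATE evaluate the filter against the new tuple, while DELETE only has the old tuple, which carries just the replica identity columns, so any other column referenced by the filter is NULL and the filter evaluates to false.

```python
# Model of which tuple version the row filter sees. DELETE's old tuple only
# contains replica identity (or PK) columns; everything else is NULL, and a
# NULL filter result is treated as false.

REPLICA_IDENTITY = {"a"}  # e.g. the primary key column of tab_rowfilter_1

def old_tuple(row):
    """DELETE's old row: non-identity columns are NULL (None)."""
    return {k: (v if k in REPLICA_IDENTITY else None) for k, v in row.items()}

def passes(qual, row):
    result = qual(row)
    return bool(result) if result is not None else False  # NULL => false

# Filter from the TAP test: a > 1000 AND b <> 'filtered'
def qual(r):
    if r["b"] is None:
        return None  # comparison against NULL yields NULL
    return r["a"] > 1000 and r["b"] != "filtered"

row = {"a": 1700, "b": "test 1700"}
insert_published = passes(qual, row)             # new row: b is visible
delete_published = passes(qual, old_tuple(row))  # old row: b is NULL
```

This is why the TAP test expects the DELETE of (1700, 'test 1700') not to be replicated even though the INSERT of the same row was.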
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 47 ++++-
src/backend/commands/publicationcmds.c | 112 +++++++----
src/backend/nodes/copyfuncs.c | 14 ++
src/backend/nodes/equalfuncs.c | 12 ++
src/backend/parser/gram.y | 24 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 13 ++
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 255 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 71 +++++++
src/test/regress/sql/publication.sql | 32 +++
src/test/subscription/t/023_row_filter.pl | 298 ++++++++++++++++++++++++++++
26 files changed, 1048 insertions(+), 75 deletions(-)
create mode 100644 src/test/subscription/t/023_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2b2c70a..d473af1 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6231,6 +6231,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b..4bb4314 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbc..35006d9 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -183,6 +187,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will
+ not be replicated. That's because the old row is used and it only contains
+ the primary key or <literal>REPLICA IDENTITY</literal> columns; the
+ remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ and <command>UPDATE</command> operations, any column might be used in the
+ <literal>WHERE</literal> clause. The new row is used and it contains all
+ columns. A <literal>NULL</literal> value causes the expression to evaluate
+ to false; avoid using columns without not-null constraints in the
+ <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
+ not allow functions or user-defined operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -197,6 +216,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -210,6 +234,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 1433905..e0149e7 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 2a2fe03..6057fc3 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,21 +144,27 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -177,6 +186,26 @@ publication_add_relation(Oid pubid, Relation targetrel,
check_publication_add_relation(targetrel);
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
+
/* Form a tuple. */
memset(values, 0, sizeof(values));
memset(nulls, false, sizeof(nulls));
@@ -189,6 +218,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -205,6 +240,14 @@ publication_add_relation(Oid pubid, Relation targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 8487eeb..4709a71 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -384,31 +384,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- Relation newrel = (Relation) lfirst(newlc);
-
- if (RelationGetRelid(newrel) == oldrelid)
- {
- found = true;
- break;
- }
- }
-
- if (!found)
- {
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
-
- delrels = lappend(delrels, oldrel);
- }
+ PublicationRelationInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelationInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -498,7 +491,8 @@ RemovePublicationRelById(Oid proid)
}
/*
- * Open relations specified by a RangeVar list.
+ * Open relations specified by a list of PublicationTable or RangeVar nodes.
+ *
* The returned tables are locked in ShareUpdateExclusiveLock mode in order to
* add them to a publication.
*/
@@ -508,16 +502,41 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationInfo *pri;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = lfirst_node(RangeVar, lc);
- bool recurse = rv->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a list of PublicationTable
+ * nodes (relation plus optional WHERE clause), whereas ALTER PUBLICATION
+ * ... DROP TABLE provides a plain Relation list. Check the node type.
+ */
+ if (IsA(lfirst(lc), PublicationTable))
+ whereclause = true;
+ else
+ whereclause = false;
+
+ if (whereclause)
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+ else
+ {
+ rv = lfirst_node(RangeVar, lc);
+ }
+
+ recurse = rv->inh;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
@@ -537,8 +556,14 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = myrelid;
+ pri->relation = rel;
+ if (whereclause)
+ pri->whereClause = t->whereClause;
+ else
+ pri->whereClause = NULL;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, myrelid);
/*
@@ -571,7 +596,15 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = childrelid;
+ pri->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (whereclause)
+ pri->whereClause = t->whereClause;
+ else
+ pri->whereClause = NULL;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, childrelid);
}
}
@@ -592,10 +625,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(pri->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -611,15 +646,15 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(pri->relid, GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(pri->relation->rd_rel->relkind),
+ RelationGetRelationName(pri->relation));
- obj = publication_add_relation(pubid, rel, if_not_exists);
+ obj = publication_add_relation(pubid, pri, if_not_exists);
if (stmt)
{
EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
@@ -643,11 +678,10 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pri->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -657,7 +691,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pri->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
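With this refactoring, OpenTableList() and PublicationAddTables() pass around PublicationRelationInfo entries instead of bare Relations, so the optional WHERE clause travels with the relation. A simplified, self-contained sketch of that list element follows; the types here (`Oid`, `Node`, the helper name) are stand-ins for illustration, not the real PostgreSQL definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef unsigned int Oid;                       /* stand-in for Oid */
typedef struct { const char *raw_expr; } Node;  /* stand-in for Node */

/*
 * In the patch, PublicationRelationInfo bundles the opened Relation with
 * its OID and the raw WHERE clause, so PublicationAddTables no longer
 * deals with bare Relation pointers.
 */
typedef struct PublicationRelationInfo
{
    Oid   relid;
    void *relation;     /* stand-in for Relation */
    Node *whereClause;  /* NULL when no row filter was given */
} PublicationRelationInfo;

/* Mirrors the list-building step in OpenTableList(); child partitions
 * receive the same whereClause as their parent. */
static PublicationRelationInfo *
make_pri(Oid relid, void *rel, Node *where)
{
    PublicationRelationInfo *pri = malloc(sizeof(PublicationRelationInfo));

    pri->relid = relid;
    pri->relation = rel;
    pri->whereClause = where;
    return pri;
}
```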
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 29020c9..63abfdd 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4840,6 +4840,17 @@ _copyAlterPublicationStmt(const AlterPublicationStmt *from)
return newnode;
}
+static PublicationTable *
+_copyPublicationTable(const PublicationTable *from)
+{
+ PublicationTable *newnode = makeNode(PublicationTable);
+
+ COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
+
+ return newnode;
+}
+
static CreateSubscriptionStmt *
_copyCreateSubscriptionStmt(const CreateSubscriptionStmt *from)
{
@@ -5704,6 +5715,9 @@ copyObjectImpl(const void *from)
case T_AlterPublicationStmt:
retval = _copyAlterPublicationStmt(from);
break;
+ case T_PublicationTable:
+ retval = _copyPublicationTable(from);
+ break;
case T_CreateSubscriptionStmt:
retval = _copyCreateSubscriptionStmt(from);
break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 8a17620..3ec66c4 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2315,6 +2315,15 @@ _equalAlterPublicationStmt(const AlterPublicationStmt *a,
}
static bool
+_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
+{
+ COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
+
+ return true;
+}
+
+static bool
_equalCreateSubscriptionStmt(const CreateSubscriptionStmt *a,
const CreateSubscriptionStmt *b)
{
@@ -3700,6 +3709,9 @@ equal(const void *a, const void *b)
case T_AlterPublicationStmt:
retval = _equalAlterPublicationStmt(a, b);
break;
+ case T_PublicationTable:
+ retval = _equalPublicationTable(a, b);
+ break;
case T_CreateSubscriptionStmt:
retval = _equalCreateSubscriptionStmt(a, b);
break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 39a2849..96c42d8 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -426,14 +426,14 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
vacuum_relation_list opt_vacuum_relation_list
- drop_option_list
+ drop_option_list publication_table_list
%type <node> opt_routine_body
%type <groupclause> group_clause
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -9620,7 +9620,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9651,7 +9651,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9659,7 +9659,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9677,6 +9677,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 24268eb..8fb953b 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -950,6 +957,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f928c32..fc4170e 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -509,6 +516,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1769,6 +1777,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3089,6 +3100,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..e946f17 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
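The user-defined-operator restriction added to make_op() above boils down to an OID threshold check: PostgreSQL assigns OIDs at or above FirstNormalObjectId (16384, per src/include/access/transam.h) to user-created objects. A minimal sketch of that check, with an illustrative function name:

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Oid;  /* stand-in for PostgreSQL's Oid */

/* OIDs at or above this boundary belong to user-created objects
 * (FirstNormalObjectId in src/include/access/transam.h). */
#define FirstNormalObjectId 16384

/* Illustrative version of the check in make_op(): only built-in
 * operators are allowed in publication WHERE expressions. */
static bool
operator_allowed_in_row_filter(Oid oproid)
{
    return oproid < FirstNormalObjectId;
}
```

The rationale is that user-defined operators can be dropped or redefined independently of the publication, so restricting filters to built-in operators keeps the stored expression safe to evaluate during decoding.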
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..48bdbc3 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the
+ * relation qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get relation qual. DISTINCT prevents the same expression, published
+ * for this table by more than one publication, from being included
+ * multiple times in the final expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
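The COPY command that copy_table() assembles for the initial sync can be sketched as follows. The helper below is a simplified stand-in for the StringInfo-based code in the patch, assuming the row filter expressions have already been fetched and deduplicated; when filters exist they are AND'ed into a single WHERE clause:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Sketch of the query construction in copy_table(): given the qualified
 * table name, the column list, and the (possibly empty) set of row
 * filter expressions from pg_publication_rel, build the COPY command
 * sent to the publisher.
 */
static void
build_copy_query(char *buf, size_t buflen, const char *qualified_name,
                 const char *columns, const char **quals, int nquals)
{
    if (nquals == 0)
    {
        /* Regular table with no row filter: plain COPY */
        snprintf(buf, buflen, "COPY %s TO STDOUT", qualified_name);
        return;
    }

    /* With filters: COPY (SELECT ... WHERE q1 AND q2 ...) TO STDOUT */
    snprintf(buf, buflen, "COPY (SELECT %s FROM %s WHERE ",
             columns, qualified_name);
    for (int i = 0; i < nquals; i++)
    {
        if (i > 0)
            strncat(buf, " AND ", buflen - strlen(buf) - 1);
        strncat(buf, quals[i], buflen - strlen(buf) - 1);
    }
    strncat(buf, ") TO STDOUT", buflen - strlen(buf) - 1);
}
```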
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index e4314af..13fd37d 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -112,6 +121,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -135,7 +146,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -144,6 +155,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -630,6 +648,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache ExprState using CacheMemoryContext. This is the same code as
+ * ExecPrepareExpr() but that is not used because it doesn't use an EState.
+ * It should probably be another function in the executor to handle the
+ * execution outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false, i.e. the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated; otherwise it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -656,7 +817,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -680,8 +841,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, txn, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -689,6 +848,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * Schema should be sent before the logic that replaces the
+ * relation because it also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -712,6 +881,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -740,6 +915,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -803,7 +984,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1104,9 +1285,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1129,6 +1311,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1140,6 +1324,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1153,6 +1338,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1162,6 +1363,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1221,9 +1425,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1341,6 +1569,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1350,6 +1579,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1367,7 +1598,13 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
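The replication decision made by pgoutput_row_filter() when a table appears in several publications can be summarized with this small sketch: every filter must match, and a NULL result counts as false. The enum and function names below are illustrative only, not part of the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Three-valued outcome of one filter expression, as in
 * pgoutput_row_filter_exec_expr(). */
typedef enum { FILTER_FALSE, FILTER_TRUE, FILTER_NULL } FilterResult;

/*
 * Sketch of the loop over entry->exprstate in pgoutput_row_filter():
 * with no filters the change is always replicated; with filters, all of
 * them must evaluate to true, and NULL is taken as false.
 */
static bool
change_is_replicated(const FilterResult *results, int nfilters)
{
    if (nfilters == 0)
        return true;            /* no row filter: replicate everything */

    for (int i = 0; i < nfilters; i++)
    {
        if (results[i] != FILTER_TRUE)
            return false;       /* false or NULL: bail out early */
    }
    return true;
}
```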
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 90ac445..f4f1298 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4148,6 +4148,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4158,9 +4159,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4169,6 +4177,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4209,6 +4218,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4241,8 +4254,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index f5e170e..0c31005 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -627,6 +627,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 8333558..97250d2 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBuffer(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index f332bad..2703b9c 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -83,6 +83,13 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationInfo
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationInfo;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -108,7 +115,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index f7b009e..065d99d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -491,6 +491,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index e28248a..d9268a9 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3625,12 +3625,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3643,7 +3650,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 1500de2..4537543 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 4a5ef0b..319c6bc 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -158,6 +158,77 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: cannot add relation "testpub_view" to publication
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075..b1606cc 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,38 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/023_row_filter.pl b/src/test/subscription/t/023_row_filter.pl
new file mode 100644
index 0000000..ca8153e
--- /dev/null
+++ b/src/test/subscription/t/023_row_filter.pl
@@ -0,0 +1,298 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions must succeed. In this case, rows are replicated only if the
+# c value is divisible by both 2 AND 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
On Tue, Jul 27, 2021 at 9:56 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Jul 27, 2021 at 6:21 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
1) UPDATE a non-key column on the publisher.
2) Use debugger to block the walsender process in function
pgoutput_row_filter_exec_expr().
3) Open another psql session connected to the publisher, and drop the table that was updated
in 1).
4) Unblock the debugger in 2), and then I can see the following error:
---
ERROR: could not read block 0 in file "base/13675/16391"

Yeah, that's a big problem. It seems the expression evaluation
machinery goes directly to detoasting the externally stored data
using some random snapshot. Ideally, in the walsender we can never
attempt to detoast the data, because there is no guarantee that the
data is preserved. Somehow, before going to the expression evaluation
machinery, I think we will have to deform that tuple and do something
about the externally stored data; otherwise it will be very difficult
to control that inside the expression evaluation.
True, I think it would be possible after we fix the issue reported in
another thread [1], where we will log the key values as part of the
old_tuple_key for toast tuples even if they are not changed. We can
have a restriction that in the WHERE clause the user can specify only
key columns for UPDATEs, similar to DELETEs. Then we have the data
required for the filter columns: if the toasted key values are
changed, they will be part of the old and new tuples anyway, and if
they are not changed, they will be part of the old tuple. I have not
checked the implementation part of it, but theoretically it seems
possible. If my understanding is correct, then it becomes necessary to
solve the other bug [1] in order to solve this part of the problem for
this patch. The other possibility is to disallow columns (datatypes)
that can lead to toasted data (at least for UPDATEs), which doesn't
sound like a good idea to me. Do you have any other ideas for this problem?
[1]: /messages/by-id/OS0PR01MB611342D0A92D4F4BF26C0F47FB229@OS0PR01MB6113.jpnprd01.prod.outlook.com
--
With Regards,
Amit Kapila.
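The restriction discussed above — allowing only replica-identity/key columns in UPDATE and DELETE row filters — can be modeled outside the server. The following Python sketch is purely illustrative (the function names and the NULL-as-None modeling are invented here; the real evaluation happens in pgoutput via an ExprState): it shows why a filter that references a non-key column cannot be reliably evaluated against a DELETE's old tuple, which carries only the key columns.

```python
# Hypothetical model of row-filter evaluation against an old tuple.
# For DELETE (and, under the proposed restriction, UPDATE), the old
# tuple only carries REPLICA IDENTITY / primary key columns; every
# other column is effectively NULL (modeled here as None).

def eval_filter(filter_columns, predicate, old_tuple, key_columns):
    """Return True only if the predicate definitely passes.

    If the filter references a column that is missing from the old
    tuple, the comparison involves NULL and the filter evaluates to
    false, so the change is silently skipped.
    """
    for col in filter_columns:
        if col not in key_columns or old_tuple.get(col) is None:
            return False  # NULL in the filter => treated as false
    return predicate(old_tuple)

# Table with PRIMARY KEY (a); filter is (a > 1000 AND b <> 'filtered')
key_columns = {"a"}
old_tuple = {"a": 1700}          # b is not part of the old tuple

skipped = not eval_filter({"a", "b"},
                          lambda t: t["a"] > 1000 and t.get("b") != "filtered",
                          old_tuple, key_columns)
print(skipped)  # True -- the DELETE of (1700, ...) is filtered out
```

This matches the TAP test expectation above, where the DELETE of row 1700 is not replicated because the filter references column b, which is absent from the old tuple.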
v21 --> v22
(This small change is only to keep the patch up-to-date with HEAD)
Changes:
* A recent commit [1] added a new TAP subscription test file 023, so
now this patch's test file (previously "023_row_filter.pl") has been
bumped to "024_row_filter.pl".
------
[1]: https://github.com/postgres/postgres/commit/63cf61cdeb7b0450dcf3b2f719c553177bac85a2
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v22-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v22-0001-Row-filter-for-logical-replication.patchDownload
From 948b703451ac0a3f4cb9464901840bee7ebbf706 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 5 Aug 2021 16:34:42 +1000
Subject: [PATCH v22] Row filter for logical replication
This feature adds a row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If the subscriber is a pre-15 version, data
synchronization won't use row filters even if they are defined on the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
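The rule that a row must satisfy every publication's expression can be sketched as a plain conjunction. This hypothetical Python model (names invented for illustration, not part of the patch) mirrors the tab_rowfilter_2 case in the regression tests, where the filters (c % 2 = 0) and (c % 3 = 0) combine:

```python
# Hypothetical model: a row is replicated only if every publication's
# row filter it appears in evaluates to true. SQL three-valued logic
# means a NULL result (modeled as None) counts as false.

def replicated(row, filters):
    results = (f(row) for f in filters)
    return all(r is True for r in results)

filters = [lambda c: c % 2 == 0,   # tap_pub_1 filter
           lambda c: c % 3 == 0]   # tap_pub_2 filter

sent = [c for c in range(1, 21) if replicated(c, filters)]
print(sent)  # [6, 12, 18]
```

Only rows divisible by both 2 and 3 survive, matching the expected `count(c)=3, min=6, max=18` in the regression output.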
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 47 ++++-
src/backend/commands/publicationcmds.c | 112 +++++++----
src/backend/nodes/copyfuncs.c | 14 ++
src/backend/nodes/equalfuncs.c | 12 ++
src/backend/parser/gram.y | 24 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 13 ++
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 255 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 71 +++++++
src/test/regress/sql/publication.sql | 32 +++
src/test/subscription/t/024_row_filter.pl | 298 ++++++++++++++++++++++++++++
26 files changed, 1048 insertions(+), 75 deletions(-)
create mode 100644 src/test/subscription/t/024_row_filter.pl
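
The semantics from the commit message above (a NULL filter result counts as
false, and a row must satisfy the WHERE expressions of every publication that
carries it) can be sketched in Python; the three-valued filters here are
simplified stand-ins for the executor's expression evaluation, not the patch's
actual code:

```python
def row_matches(row, filters):
    """Return True only if every filter evaluates to True for this row.

    Each filter returns True, False, or None (standing in for SQL NULL);
    NULL is treated as False, so the row is not replicated.
    """
    for f in filters:
        result = f(row)
        if result is not True:  # False or None (NULL) both reject the row
            return False
    return True

# Two publications publish the same table with different WHERE clauses;
# the subscriber only copies rows satisfying both.
filters = [
    lambda r: r["id"] > 2000 if r["id"] is not None else None,
    lambda r: r["active"],
]

print(row_matches({"id": 2500, "active": True}, filters))   # True
print(row_matches({"id": 1500, "active": True}, filters))   # False
print(row_matches({"id": None, "active": True}, filters))   # NULL -> False
```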
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2b2c70a..d473af1 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6231,6 +6231,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b..4bb4314 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbc..35006d9 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. Row filtering has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -183,6 +187,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will not
+ be replicated. That is because the old row is used, and it only contains the
+ primary key or the columns that are part of the <literal>REPLICA
+ IDENTITY</literal>; the remaining columns are <literal>NULL</literal>. For
+ <command>INSERT</command> and <command>UPDATE</command> operations, any
+ column might be used in the <literal>WHERE</literal> clause. The new row is
+ used, and it contains all columns. A <literal>NULL</literal> value causes
+ the expression to evaluate to false; avoid using columns without not-null
+ constraints in the <literal>WHERE</literal> clause. The
+ <literal>WHERE</literal> clause does not allow functions or user-defined operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -197,6 +216,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -210,6 +234,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 702934e..94e3981 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15, any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 2a2fe03..6057fc3 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,21 +144,27 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -177,6 +186,26 @@ publication_add_relation(Oid pubid, Relation targetrel,
check_publication_add_relation(targetrel);
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
+
/* Form a tuple. */
memset(values, 0, sizeof(values));
memset(nulls, false, sizeof(nulls));
@@ -189,6 +218,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -205,6 +240,14 @@ publication_add_relation(Oid pubid, Relation targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 8487eeb..4709a71 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -384,31 +384,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- Relation newrel = (Relation) lfirst(newlc);
-
- if (RelationGetRelid(newrel) == oldrelid)
- {
- found = true;
- break;
- }
- }
-
- if (!found)
- {
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
-
- delrels = lappend(delrels, oldrel);
- }
+ PublicationRelationInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelationInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -498,7 +491,8 @@ RemovePublicationRelById(Oid proid)
}
/*
- * Open relations specified by a RangeVar list.
+ * Open relations specified by a list of PublicationTable or RangeVar nodes.
+ *
* The returned tables are locked in ShareUpdateExclusiveLock mode in order to
* add them to a publication.
*/
@@ -508,16 +502,41 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationInfo *pri;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = lfirst_node(RangeVar, lc);
- bool recurse = rv->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a PublicationTable List
+ * (Relation, Where clause). ALTER PUBLICATION ... DROP TABLE provides
+ * a Relation List. Check the List element to be used.
+ */
+ if (IsA(lfirst(lc), PublicationTable))
+ whereclause = true;
+ else
+ whereclause = false;
+
+ if (whereclause)
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+ else
+ {
+ rv = lfirst_node(RangeVar, lc);
+ }
+
+ recurse = rv->inh;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
@@ -537,8 +556,14 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = myrelid;
+ pri->relation = rel;
+ if (whereclause)
+ pri->whereClause = t->whereClause;
+ else
+ pri->whereClause = NULL;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, myrelid);
/*
@@ -571,7 +596,15 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = childrelid;
+ pri->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (whereclause)
+ pri->whereClause = t->whereClause;
+ else
+ pri->whereClause = NULL;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, childrelid);
}
}
@@ -592,10 +625,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(pri->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -611,15 +646,15 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(pri->relid, GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(pri->relation->rd_rel->relkind),
+ RelationGetRelationName(pri->relation));
- obj = publication_add_relation(pubid, rel, if_not_exists);
+ obj = publication_add_relation(pubid, pri, if_not_exists);
if (stmt)
{
EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
@@ -643,11 +678,10 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pri->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -657,7 +691,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pri->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 29020c9..63abfdd 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4840,6 +4840,17 @@ _copyAlterPublicationStmt(const AlterPublicationStmt *from)
return newnode;
}
+static PublicationTable *
+_copyPublicationTable(const PublicationTable *from)
+{
+ PublicationTable *newnode = makeNode(PublicationTable);
+
+ COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
+
+ return newnode;
+}
+
static CreateSubscriptionStmt *
_copyCreateSubscriptionStmt(const CreateSubscriptionStmt *from)
{
@@ -5704,6 +5715,9 @@ copyObjectImpl(const void *from)
case T_AlterPublicationStmt:
retval = _copyAlterPublicationStmt(from);
break;
+ case T_PublicationTable:
+ retval = _copyPublicationTable(from);
+ break;
case T_CreateSubscriptionStmt:
retval = _copyCreateSubscriptionStmt(from);
break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 8a17620..3ec66c4 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2315,6 +2315,15 @@ _equalAlterPublicationStmt(const AlterPublicationStmt *a,
}
static bool
+_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
+{
+ COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
+
+ return true;
+}
+
+static bool
_equalCreateSubscriptionStmt(const CreateSubscriptionStmt *a,
const CreateSubscriptionStmt *b)
{
@@ -3700,6 +3709,9 @@ equal(const void *a, const void *b)
case T_AlterPublicationStmt:
retval = _equalAlterPublicationStmt(a, b);
break;
+ case T_PublicationTable:
+ retval = _equalPublicationTable(a, b);
+ break;
case T_CreateSubscriptionStmt:
retval = _equalCreateSubscriptionStmt(a, b);
break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 39a2849..96c42d8 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -426,14 +426,14 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
vacuum_relation_list opt_vacuum_relation_list
- drop_option_list
+ drop_option_list publication_table_list
%type <node> opt_routine_body
%type <groupclause> group_clause
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -9620,7 +9620,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9651,7 +9651,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9659,7 +9659,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9677,6 +9677,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 24268eb..8fb953b 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -950,6 +957,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f928c32..fc4170e 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -509,6 +516,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1769,6 +1777,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3089,6 +3100,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..e946f17 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..48bdbc3 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get relation qual. DISTINCT prevents the same expression of a table in
+ * multiple publications from being included multiple times in the final
+ * expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
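
The query-building logic in copy_table() above can be sketched outside the
server; the schema, column, and filter strings here are illustrative, and real
code must quote identifiers as quote_qualified_identifier() does:

```python
def build_copy_query(nspname, relname, columns, quals):
    """Mimic copy_table(): plain COPY when there is no row filter, otherwise
    COPY (SELECT ... WHERE f1 AND f2 ...) with the AND'ed row filters."""
    target = f"{nspname}.{relname}"
    if not quals:
        # Regular table with no row filter: a plain COPY suffices.
        return f"COPY {target} TO STDOUT"
    cols = ", ".join(columns)
    where = " AND ".join(quals)  # list of AND'ed filters
    return f"COPY (SELECT {cols} FROM {target} WHERE {where}) TO STDOUT"

print(build_copy_query("public", "departments", ["id", "name"], []))
# COPY public.departments TO STDOUT
print(build_copy_query("public", "departments", ["id", "name"],
                       ["(id > 2000)", "(id <= 3000)"]))
# COPY (SELECT id, name FROM public.departments WHERE (id > 2000) AND (id <= 3000)) TO STDOUT
```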
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 286119c..4c9c7f6 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +148,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +157,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +639,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache the ExprState in CacheMemoryContext. This duplicates what
+ * ExecPrepareExpr() does, but that function cannot be used here because
+ * it requires an EState. There should probably be a separate executor
+ * function for preparing expressions outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluate the row filter.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the
+ * change is not replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
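The NULL-as-false rule above mirrors how SQL treats a WHERE clause: a filter that evaluates to NULL rejects the row. A minimal Python sketch of that three-valued logic (a hypothetical helper, not PostgreSQL code; `None` stands in for SQL NULL):

```python
def row_filter_exec(value):
    """Mimic pgoutput_row_filter_exec_expr(): a row filter result of
    NULL (None here) is treated as false, so the change is not
    replicated -- the same rule a SQL WHERE clause applies."""
    if value is None:
        return False
    return bool(value)

# e.g. the filter (b <> 'filtered') with b = NULL yields NULL,
# hence the change is filtered out
assert row_filter_exec(None) is False
assert row_filter_exec(True) is True
assert row_filter_exec(False) is False
```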
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change should be replicated, false otherwise.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has
+ * a different row filter in those publications, every row filter must
+ * match for the change to be replicated.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
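The loop above ANDs the filters of every publication the table belongs to, bailing out on the first failure. A compact Python sketch of that combination rule (hypothetical helpers, not the patch's code):

```python
def row_filter(filters, row):
    """Sketch of pgoutput_row_filter(): when the same table carries a
    row filter in several publications of one subscription, every
    filter must match (logical AND); bail out on the first failure.
    A NULL (None) result counts as false."""
    for f in filters:
        result = f(row)
        if result is None or not result:
            return False
    return True

# tap_pub_1 uses (c % 2 = 0), tap_pub_2 uses (c % 3 = 0):
# only rows divisible by both 2 and 3 are replicated
filters = [lambda r: r["c"] % 2 == 0, lambda r: r["c"] % 3 == 0]
assert row_filter(filters, {"c": 6}) is True
assert row_filter(filters, {"c": 4}) is False  # fails the % 3 filter
assert row_filter([], {"c": 4}) is True        # no filter: replicate
```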
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +808,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +832,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, txn, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +839,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * Send the schema before the relation-switching logic below,
+ * because maybe_send_schema() also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +872,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +906,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, txn, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +975,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1113,9 +1294,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1138,6 +1320,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1149,6 +1333,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1162,6 +1347,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for the row filter. The TupleDesc must
+ * live as long as the cache entry does, hence the copy into
+ * CacheMemoryContext.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1171,6 +1372,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1230,9 +1434,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache the row filter, if any. All publication-table mappings
+ * must be checked. If the relation is a partition and pubviaroot
+ * is true, use the row filter of the topmost partitioned table
+ * instead of the partition's own row filter.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1350,6 +1578,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1359,6 +1588,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1376,7 +1607,13 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 90ac445..f4f1298 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4148,6 +4148,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4158,9 +4159,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4169,6 +4177,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4209,6 +4218,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4241,8 +4254,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
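On the dump side, the clause is read back with `pg_get_expr(prqual, prrelid)` and re-emitted, parenthesized, after the table name. The shape of the generated statement can be sketched as follows (a hypothetical helper, not pg_dump code):

```python
def dump_publication_table(pubname, qualified_table, qual=None):
    """Sketch of dumpPublicationTable(): append the WHERE clause,
    parenthesized, only when prqual was non-NULL in the catalog."""
    stmt = f"ALTER PUBLICATION {pubname} ADD TABLE ONLY {qualified_table}"
    if qual is not None:
        stmt += f" WHERE ({qual})"
    return stmt + ";\n"

# with a row filter
assert dump_publication_table(
    "testpub5", "public.testpub_rf_tbl3",
    "(e > 1000) AND (e < 2000)"
) == ("ALTER PUBLICATION testpub5 ADD TABLE ONLY public.testpub_rf_tbl3"
      " WHERE ((e > 1000) AND (e < 2000));\n")
# without one
assert dump_publication_table("testpub5", "public.testpub_rf_tbl1") == \
    "ALTER PUBLICATION testpub5 ADD TABLE ONLY public.testpub_rf_tbl1;\n"
```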
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index f5e170e..0c31005 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -627,6 +627,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 8333558..97250d2 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBuffer(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index f332bad..2703b9c 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -83,6 +83,13 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationInfo
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationInfo;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -108,7 +115,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index f7b009e..065d99d 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -491,6 +491,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index e28248a..d9268a9 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3625,12 +3625,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3643,7 +3650,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 1500de2..4537543 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 4a5ef0b..319c6bc 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -158,6 +158,77 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: cannot add relation "testpub_view" to publication
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075..b1606cc 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,38 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/024_row_filter.pl b/src/test/subscription/t/024_row_filter.pl
new file mode 100644
index 0000000..ca8153e
--- /dev/null
+++ b/src/test/subscription/t/024_row_filter.pl
@@ -0,0 +1,298 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When the same table appears in multiple publications, every filter
+# expression must match. Here a row is replicated only if c is divisible
+# by both 2 and 3, i.e. c in (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# The publication option publish_via_partition_root is false, so the row
+# filter of each partition (not of the root) is used.
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700)               NO, because the row filter contains column b,
+#                                   which is not part of the PK or REPLICA IDENTITY,
+#                                   so the old tuple has b = NULL and the row filter
+#                                   evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After the TRUNCATE and ALTER SUBSCRIPTION ... REFRESH PUBLICATION, the
+# initial data copy will apply the root partitioned table's row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950)   NO, 14000 is not < 5000
+# - UPDATE (4001, 30)      YES, 4001 < 5000
+# - DELETE (4002)          YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
v22 --> v23
Changes:
* A rebase was needed (due to commit [1]) to keep the patch working with cfbot.
------
[1]: https://github.com/postgres/postgres/commit/93d573d86571d148e2d14415166ec6981d34ea9d
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v23-0001-Row-filter-for-logical-replication.patch application/octet-stream; name=v23-0001-Row-filter-for-logical-replication.patch Download
From 7252c2b6a17f9d05e680f24c3f2d11fb956ab024 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Sun, 8 Aug 2021 15:54:25 +1000
Subject: [PATCH v23] Row filter for logical replication
This feature adds a row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed in parentheses.
The WHERE clause should contain only columns that are part of the
primary key or are covered by REPLICA IDENTITY. Otherwise, DELETEs won't
be replicated, because DELETE uses the old row version (which is limited
to the primary key or REPLICA IDENTITY) to evaluate the row filter.
INSERT and UPDATE use the new row version, so any column can be used in
the filter. If the row filter evaluates to NULL, it is treated as false.
For simplicity, functions are not allowed; this could be addressed in a
future patch.
If you choose to do the initial table synchronization, only data that
satisfies the row filters is sent. If the subscription has several
publications in which a table has been published with different WHERE
clauses, rows must satisfy all expressions to be copied. If the
subscriber is a pre-15 version, data synchronization won't use row
filters defined on the publisher, because previous versions cannot
handle them.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines whether the row filter of the
partition (when the parameter is false, the default) or of the root
partitioned table is used.
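A minimal sketch of the syntax this commit message describes (the table,
column, and connection names are illustrative, not taken from the patch's
tests):

```sql
-- Publish only a slice of each table; parentheses around the
-- expression are required.
CREATE PUBLICATION pub_orders
    FOR TABLE orders WHERE (order_id > 1000);

-- Filters can also be attached when adding tables later.
ALTER PUBLICATION pub_orders
    ADD TABLE order_lines WHERE (qty <> 0);

-- On the subscriber, the initial copy (copy_data = true) fetches
-- only rows that satisfy the publisher-side filters.
CREATE SUBSCRIPTION sub_orders
    CONNECTION 'host=pub.example dbname=postgres'
    PUBLICATION pub_orders;
```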
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 47 ++++-
src/backend/commands/publicationcmds.c | 112 +++++++----
src/backend/nodes/copyfuncs.c | 14 ++
src/backend/nodes/equalfuncs.c | 12 ++
src/backend/parser/gram.y | 24 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 13 ++
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 255 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 71 +++++++
src/test/regress/sql/publication.sql | 32 +++
src/test/subscription/t/024_row_filter.pl | 298 ++++++++++++++++++++++++++++
26 files changed, 1048 insertions(+), 75 deletions(-)
create mode 100644 src/test/subscription/t/024_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2b2c70a..d473af1 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6231,6 +6231,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b..4bb4314 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbc..35006d9 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -183,6 +187,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will
+ not be replicated. That's because the old row is used, and it only
+ contains the primary key or the columns that are part of the
+ <literal>REPLICA IDENTITY</literal>; the remaining columns are
+ <literal>NULL</literal>. For <command>INSERT</command> and
+ <command>UPDATE</command> operations, any column might be used in the
+ <literal>WHERE</literal> clause. The new row is used, and it contains
+ all columns. A <literal>NULL</literal> value causes the expression to
+ evaluate to false; avoid using columns without not-null constraints in
+ the <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause
+ does not allow functions or user-defined operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -197,6 +216,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -210,6 +234,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 702934e..94e3981 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 2a2fe03..6057fc3 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,21 +144,27 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -177,6 +186,26 @@ publication_add_relation(Oid pubid, Relation targetrel,
check_publication_add_relation(targetrel);
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
+
/* Form a tuple. */
memset(values, 0, sizeof(values));
memset(nulls, false, sizeof(nulls));
@@ -189,6 +218,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -205,6 +240,14 @@ publication_add_relation(Oid pubid, Relation targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 8487eeb..4709a71 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -384,31 +384,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could possibly remove only
+ * (i) tables that are not found in the new table list and (ii) tables
+ * that are being re-added with a different qual expression, but for
+ * (ii) simply updating the existing tuple is not enough, because of
+ * qual expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- Relation newrel = (Relation) lfirst(newlc);
-
- if (RelationGetRelid(newrel) == oldrelid)
- {
- found = true;
- break;
- }
- }
-
- if (!found)
- {
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
-
- delrels = lappend(delrels, oldrel);
- }
+ PublicationRelationInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelationInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -498,7 +491,8 @@ RemovePublicationRelById(Oid proid)
}
/*
- * Open relations specified by a RangeVar list.
+ * Open relations specified by a list of PublicationTable or RangeVar nodes.
+ *
* The returned tables are locked in ShareUpdateExclusiveLock mode in order to
* add them to a publication.
*/
@@ -508,16 +502,41 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationInfo *pri;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = lfirst_node(RangeVar, lc);
- bool recurse = rv->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a list of PublicationTable
+ * nodes (relation and optional WHERE clause), while ALTER PUBLICATION
+ * ... DROP TABLE provides a list of RangeVar nodes. Check which kind
+ * of element this one is.
+ */
+ if (IsA(lfirst(lc), PublicationTable))
+ whereclause = true;
+ else
+ whereclause = false;
+
+ if (whereclause)
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+ else
+ {
+ rv = lfirst_node(RangeVar, lc);
+ }
+
+ recurse = rv->inh;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
@@ -537,8 +556,14 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = myrelid;
+ pri->relation = rel;
+ if (whereclause)
+ pri->whereClause = t->whereClause;
+ else
+ pri->whereClause = NULL;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, myrelid);
/*
@@ -571,7 +596,15 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = childrelid;
+ pri->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (whereclause)
+ pri->whereClause = t->whereClause;
+ else
+ pri->whereClause = NULL;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, childrelid);
}
}
@@ -592,10 +625,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(pri->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -611,15 +646,15 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(pri->relid, GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(pri->relation->rd_rel->relkind),
+ RelationGetRelationName(pri->relation));
- obj = publication_add_relation(pubid, rel, if_not_exists);
+ obj = publication_add_relation(pubid, pri, if_not_exists);
if (stmt)
{
EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
@@ -643,11 +678,10 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pri->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -657,7 +691,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pri->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 29020c9..63abfdd 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4840,6 +4840,17 @@ _copyAlterPublicationStmt(const AlterPublicationStmt *from)
return newnode;
}
+static PublicationTable *
+_copyPublicationTable(const PublicationTable *from)
+{
+ PublicationTable *newnode = makeNode(PublicationTable);
+
+ COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
+
+ return newnode;
+}
+
static CreateSubscriptionStmt *
_copyCreateSubscriptionStmt(const CreateSubscriptionStmt *from)
{
@@ -5704,6 +5715,9 @@ copyObjectImpl(const void *from)
case T_AlterPublicationStmt:
retval = _copyAlterPublicationStmt(from);
break;
+ case T_PublicationTable:
+ retval = _copyPublicationTable(from);
+ break;
case T_CreateSubscriptionStmt:
retval = _copyCreateSubscriptionStmt(from);
break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 8a17620..3ec66c4 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2315,6 +2315,15 @@ _equalAlterPublicationStmt(const AlterPublicationStmt *a,
}
static bool
+_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
+{
+ COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
+
+ return true;
+}
+
+static bool
_equalCreateSubscriptionStmt(const CreateSubscriptionStmt *a,
const CreateSubscriptionStmt *b)
{
@@ -3700,6 +3709,9 @@ equal(const void *a, const void *b)
case T_AlterPublicationStmt:
retval = _equalAlterPublicationStmt(a, b);
break;
+ case T_PublicationTable:
+ retval = _equalPublicationTable(a, b);
+ break;
case T_CreateSubscriptionStmt:
retval = _equalCreateSubscriptionStmt(a, b);
break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 39a2849..96c42d8 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -426,14 +426,14 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
vacuum_relation_list opt_vacuum_relation_list
- drop_option_list
+ drop_option_list publication_table_list
%type <node> opt_routine_body
%type <groupclause> group_clause
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -9620,7 +9620,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9651,7 +9651,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9659,7 +9659,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9677,6 +9677,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 24268eb..8fb953b 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -950,6 +957,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f928c32..fc4170e 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -509,6 +516,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1769,6 +1777,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3089,6 +3100,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..e946f17 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..48bdbc3 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get the relation quals. DISTINCT prevents the same expression for a
+ * table published in multiple publications from being included more
+ * than once in the final expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 14d737f..ef1ba91 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +148,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +157,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +639,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache ExprState using CacheMemoryContext. This is the same code as
+ * ExecPrepareExpr(), which is not used here because it requires an EState.
+ * This logic should probably become a separate executor function that
+ * handles expression execution outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false, i.e., the change
+ * isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications that publish the same
+ * table with different row filters, all of those filters must match for
+ * the change to be replicated.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +808,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +832,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +839,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * The schema must be sent before the code below possibly switches the
+ * relation to its ancestor, because maybe_send_schema() also sends the
+ * ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +872,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +906,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +975,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1113,9 +1294,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1138,6 +1320,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1149,6 +1333,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1162,6 +1347,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1171,6 +1372,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1230,9 +1434,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1350,6 +1578,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1359,6 +1588,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1376,7 +1607,13 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
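
(A semantics sketch, not the patch's C: pgoutput_row_filter() evaluates each publication's filter in turn, treats a NULL result as false, and replicates the change only if every filter matches, stopping at the first miss. Filters are modeled here as plain Python callables that may return None for SQL NULL.)

```python
def row_matches(filters, row):
    """Replicate the change only if all row filters match."""
    for f in filters:
        result = f(row)          # may return True, False, or None (SQL NULL)
        if result is None or not result:
            return False         # NULL is taken as false; short-circuit
    return True                  # no filters, or all matched

filters = [lambda r: r["a"] > 1000,
           lambda r: None if r["b"] is None else r["b"] != "filtered"]
assert row_matches(filters, {"a": 1980, "b": "not filtered"})
assert not row_matches(filters, {"a": 1980, "b": None})  # NULL result
assert row_matches([], {"a": 1})                         # no filter: replicate
```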
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 90ac445..f4f1298 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4148,6 +4148,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4158,9 +4159,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4169,6 +4177,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4209,6 +4218,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4241,8 +4254,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index f5e170e..0c31005 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -627,6 +627,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
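
(One more illustrative sketch: on the dump side, dumpPublicationTable() appends the row filter — already deparsed on the server by pg_get_expr — to the ALTER PUBLICATION statement. This Python mirrors only the string assembly; names are illustrative.)

```python
def dump_publication_table(pubname, tablename, qual=None):
    """Sketch of the dumped ALTER PUBLICATION ... ADD TABLE statement."""
    stmt = f"ALTER PUBLICATION {pubname} ADD TABLE ONLY {tablename}"
    if qual is not None:
        stmt += f" WHERE ({qual})"       # qual comes back NULL pre-v15
    return stmt + ";\n"

assert dump_publication_table("testpub5", "public.testpub_rf_tbl3",
                              "(e > 300) AND (e < 500)") == \
    "ALTER PUBLICATION testpub5 ADD TABLE ONLY public.testpub_rf_tbl3 " \
    "WHERE ((e > 300) AND (e < 500));\n"
```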
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 8333558..97250d2 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBuffer(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index f332bad..2703b9c 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -83,6 +83,13 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationInfo
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationInfo;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -108,7 +115,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 6a4d82f..56d13ff 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -490,6 +490,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index e28248a..d9268a9 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3625,12 +3625,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3643,7 +3650,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 1500de2..4537543 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 4a5ef0b..319c6bc 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -158,6 +158,77 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: cannot add relation "testpub_view" to publication
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075..b1606cc 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,38 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/024_row_filter.pl b/src/test/subscription/t/024_row_filter.pl
new file mode 100644
index 0000000..ca8153e
--- /dev/null
+++ b/src/test/subscription/t/024_row_filter.pl
@@ -0,0 +1,298 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions must succeed. In this case, rows are replicated only if c is
+# divisible by both 2 AND 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
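The combined-filter semantics exercised by the tab_rowfilter_2 test above (a row is replicated only when the filter of every publication it is part of passes, and a NULL result counts as false) can be modeled outside the server. This is an illustrative sketch in Python, not PostgreSQL code:

```python
def row_passes(row, filters):
    """Model of combining row filters from multiple publications:
    a row is replicated only if every filter evaluates to true.
    A filter evaluating to SQL NULL (None here) counts as false."""
    for f in filters:
        if f(row) is not True:  # False or None both block the row
            return False
    return True

# Filters from tap_pub_1 and tap_pub_2 for tab_rowfilter_2
filters = [lambda r: r["c"] % 2 == 0,   # (c % 2 = 0)
           lambda r: r["c"] % 3 == 0]   # (c % 3 = 0)

replicated = [c for c in range(1, 21) if row_passes({"c": c}, filters)]
# → [6, 12, 18], matching the expected initial copy for tab_rowfilter_2
```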
On Tue, Aug 3, 2021 at 4:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jul 27, 2021 at 9:56 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
Yeah, that's a big problem, seems like the expression evaluation
machinery directly going and detoasting the externally stored data
using some random snapshot. Ideally, in walsender we can never
attempt to detoast the data because there is no guarantee that those
data are preserved. Somehow before going to the expression evaluation
machinery, I think we will have to deform that tuple and need to do
something for the externally stored data otherwise it will be very
difficult to control that inside the expression evaluation.

True, I think it would be possible after we fix the issue reported in
another thread [1] where we will log the key values as part of
old_tuple_key for toast tuples even if they are not changed. We can
have a restriction that in the WHERE clause that user can specify only
Key columns for Updates similar to Deletes. Then, we have the data
required for filter columns basically if the toasted key values are
changed, then they will be anyway part of the old and new tuple and if
they are not changed then they will be part of the old tuple.
Right.
I have
not checked the implementation part of it but theoretically, it seems
possible.
Yeah, it would be possible because at least after fixing [1] we
would have the required column data. The only thing I am worried
about is that, while applying the filter on the new tuple, the unchanged
toasted key data will not be part of the new tuple. So we cannot
directly call the expression evaluation machinery; basically, we
need to deform the new tuple and then substitute the data from the old
tuple before passing it to expression evaluation. Anyway, this is an
implementation detail, so we can look into it while implementing.
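The deform-and-substitute idea described above can be sketched in Python. The column names and the None-for-unavailable convention are illustrative assumptions, not the actual on-disk tuple format:

```python
def tuple_for_filter(new_tuple, old_tuple, key_columns):
    """Sketch of the idea above: before evaluating the row filter on an
    UPDATE's new tuple, fill in key columns whose unchanged (toasted)
    values were not sent with the new tuple by copying them from the
    old tuple. None stands in for "value not present in this tuple"."""
    merged = dict(new_tuple)
    for col in key_columns:
        if merged.get(col) is None and old_tuple.get(col) is not None:
            merged[col] = old_tuple[col]  # unchanged toasted key value
    return merged

old = {"id": 1, "big_key": "x" * 10000}        # old tuple carries the key
new = {"id": 1, "big_key": None, "val": 42}    # unchanged toasted key omitted
merged = tuple_for_filter(new, old, ["big_key"])
# merged now has both the new column values and the key value needed
# by a filter such as (big_key <> '...'), so the filter can be evaluated.
```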
If my understanding is correct then it becomes necessary to
solve the other bug [1] to solve this part of the problem for this
patch.
Right.
The other possibility is to disallow columns (datatypes) that
can lead to toasted data (at least for Updates) which doesn't sound
like a good idea to me.
Yeah, that will be a big limitation; then we won't be able to allow
expressions on any varlena types.
Do you have any other ideas for this problem?
As of now no other better idea to suggest.
[1]: /messages/by-id/OS0PR01MB611342D0A92D4F4BF26C0F47FB229@OS0PR01MB6113.jpnprd01.prod.outlook.com
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Mon, Jul 12, 2021 at 7:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jul 12, 2021 at 1:09 AM Euler Taveira <euler@eulerto.com> wrote:
I did another measurement using the previous patch (v16) as the baseline.

without cache (v16)
-------------------
mean: 1.46 us
stddev: 2.13 us
median: 1.39 us
min-max: [0.69 .. 1456.69] us
percentile(99): 3.15 us
mode: 0.91 us

with cache (v18)
----------------
mean: 0.63 us
stddev: 1.07 us
median: 0.55 us
min-max: [0.29 .. 844.87] us
percentile(99): 1.38 us
mode: 0.41 us

It represents -57%. It is a really good optimization for just a few extra lines
of code.

Good improvement, but I think it is better to measure the performance
by using synchronous replication, setting the subscriber in
synchronous_standby_names, which will show the overall saving of
time. We can probably see the timings when no rows are filtered,
when 10% of rows are filtered, when 30% are filtered, and so on.

I think the way caching has been done in the patch is a bit
inefficient. Basically, it always invalidates and rebuilds the
expressions even though some unrelated operation has happened on
publication. For example, say publication has initially table t1 with
rowfilter r1 for which we have cached the state. Now you altered
publication and added table t2, it will invalidate the entire state of
t1 as well. I think we can avoid that if we invalidate the rowfilter
related state only on relcache invalidation i.e in
rel_sync_cache_relation_cb and save it the very first time we prepare
the expression. In that case, we don't need to do it in advance when
preparing relsyncentry, this will have the additional advantage that
we won't spend cycles on preparing state unless it is required (for
truncate we won't require row_filtering, so it won't be prepared).
I have used debug logging to confirm that what Amit wrote [1] is
correct; the row-filter ExprState of *every* table's row_filter will
be invalidated (and so subsequently gets rebuilt) when the user
changes the PUBLICATION tables. This was a side-effect of the
rel_sync_cache_publication_cb which is freeing the cached ExprState
and setting the entry->replicate_valid = false; for *every* entry.
So yes, the ExprCache is getting rebuilt for some situations where it
is not strictly necessary to do so.
But...
1. Although the ExprState cache is effective, in practice the
performance improvement was not very large. My previous results [2]
showed only about a 2 sec saving over 100K calls to the
pgoutput_row_filter function. So I think eliminating just one or two
unnecessary calls in get_rel_sync_entry is going to make zero
observable difference.
2. IMO it is safe to expect that ALTER PUBLICATION is a rare
operation relative to the number of times that pgoutput_row_filter
will be called (pgoutput_row_filter is quite a "hot" function,
since it is called for every INSERT/UPDATE/DELETE). The difference will
be orders of magnitude, 1:1000, 1:100000, etc.
~~
Anyway, I have implemented the suggested cache change because I agree
it is probably theoretically superior, even if in practice there is
almost no difference.
PSA 2 new patches (v24*)
Summary:
1. Now the rfnode_list row-filter cache is built 1 time only in
function get_rel_sync_entry.
2. Now the ExprState list cache is lazy-built 1 time only when first
needed in function pgoutput_row_filter
3. Now those caches are invalidated in function
rel_sync_cache_relation_cb; Invalidation of one relation's caches will
no longer cause the other relations' row-filter caches to be re-built.
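The per-relation lazy cache with targeted invalidation summarized above can be modeled as follows. This is a sketch; the names (e.g. `build_expr_state`) are invented for illustration and do not match the C implementation:

```python
class RelSyncEntry:
    """Per-relation cache entry: the compiled row-filter state is built
    lazily on first use and dropped on invalidation of that relation only."""
    def __init__(self, rel_oid):
        self.rel_oid = rel_oid
        self.expr_state = None          # lazy-built on first filter call

cache = {}
builds = 0

def build_expr_state(rel_oid):
    # Stand-in for preparing the filter expression (assumption: one
    # predicate per relation, here modeling WHERE (a > 1000)).
    global builds
    builds += 1
    return lambda row: row.get("a", 0) > 1000

def row_filter(rel_oid, row):
    entry = cache.setdefault(rel_oid, RelSyncEntry(rel_oid))
    if entry.expr_state is None:        # first use: lazy build
        entry.expr_state = build_expr_state(rel_oid)
    return entry.expr_state(row)

def rel_sync_cache_relation_cb(rel_oid):
    # Invalidate only this relation's cached state, not every entry.
    if rel_oid in cache:
        cache[rel_oid].expr_state = None

row_filter(1, {"a": 2000}); row_filter(1, {"a": 500})   # one build for rel 1
row_filter(2, {"a": 2000})                               # one build for rel 2
rel_sync_cache_relation_cb(1)                            # rel 2 unaffected
row_filter(1, {"a": 2000})                               # rel 1 rebuilt
# build_expr_state ran 3 times: once per relation plus one rebuild
```

Compared with invalidating every entry on any publication change, only the affected relation pays the rebuild cost here.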
------
I also ran performance tests to compare the old/new ExprState caching.
These tests are inserting 1 million rows using different percentages
of row filtering.
Please refer to the attached results.
The main takeaway points from the test results are:
1. Using row-filter ExprState caching is slightly better than having
no ExprState caching.
2. The old/new style ExprState caches have approximately the same
performance. Essentially the *only* runtime difference with the
old/new cache is the added condition in pgoutput_row_filter to
check if the ExprState cache needs to be lazy-built or not. Over a
million rows maybe this extra condition accounts for a tiny difference
or maybe the small before/after differences can be attributed just to
natural runtime variations.
------
[1]: /messages/by-id/CAA4eK1+xQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg@mail.gmail.com
[2]: /messages/by-id/CAHut+Ps3GgPKUJ2npfY4bQdxAmYW+yQin+hQuBsMYvX=kBqEpA@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v24-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v24-0001-Row-filter-for-logical-replication.patchDownload
From 99f7073c06a268c4e96403d234190d8ce4c76ff7 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Tue, 24 Aug 2021 15:49:51 +1000
Subject: [PATCH v24] Row filter for logical replication
This feature adds a row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If the subscriber is a pre-15 version, data
synchronization won't use row filters even if they are defined on the
publisher. Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 47 ++++-
src/backend/commands/publicationcmds.c | 112 +++++++----
src/backend/nodes/copyfuncs.c | 14 ++
src/backend/nodes/equalfuncs.c | 12 ++
src/backend/parser/gram.y | 24 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 13 ++
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 298 +++++++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 71 +++++++
src/test/regress/sql/publication.sql | 32 +++
src/test/subscription/t/024_row_filter.pl | 298 ++++++++++++++++++++++++++++
26 files changed, 1091 insertions(+), 75 deletions(-)
create mode 100644 src/test/subscription/t/024_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2b2c70a..d473af1 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6231,6 +6231,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b..4bb4314 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbc..35006d9 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -183,6 +187,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or that are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will not
+ be replicated. That's because the old row is used and it contains only the
+ primary key or columns that are part of the <literal>REPLICA IDENTITY</literal>;
+ the remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ and <command>UPDATE</command> operations, any column might be used in the
+ <literal>WHERE</literal> clause. The new row is used and it contains all
+ columns. A <literal>NULL</literal> value causes the expression to evaluate
+ to false; avoid using columns without not-null constraints in the
+ <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
+ not allow functions and user-defined operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -197,6 +216,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -210,6 +234,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 702934e..94e3981 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 2a2fe03..6057fc3 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,21 +144,27 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -177,6 +186,26 @@ publication_add_relation(Oid pubid, Relation targetrel,
check_publication_add_relation(targetrel);
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
+
/* Form a tuple. */
memset(values, 0, sizeof(values));
memset(nulls, false, sizeof(nulls));
@@ -189,6 +218,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -205,6 +240,14 @@ publication_add_relation(Oid pubid, Relation targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 8487eeb..4709a71 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -384,31 +384,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings and re-add the new list.  We
+ * could instead remove only (i) tables that are not found in the new
+ * table list and (ii) tables that are being re-added with a different
+ * qual expression, but for (ii) simply updating the existing tuple is
+ * not enough because of qual expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- Relation newrel = (Relation) lfirst(newlc);
-
- if (RelationGetRelid(newrel) == oldrelid)
- {
- found = true;
- break;
- }
- }
-
- if (!found)
- {
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
-
- delrels = lappend(delrels, oldrel);
- }
+ PublicationRelationInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelationInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
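[As the comment in the hunk above explains, ALTER PUBLICATION ... SET TABLE drops every existing mapping and re-adds the new list instead of diffing the two lists, because an in-place qual update would leave stale expression dependencies. A minimal Python sketch of the two reconciliation strategies; function names and inputs are hypothetical, for illustration only:]

```python
def set_tables_drop_all(old_relids, new_tables):
    """Strategy used by the patch: drop every old mapping, re-add all new ones.

    This stays correct even when a table is re-added with a different WHERE
    qual, because each add re-records the qual's expression dependencies.
    """
    dropped = list(old_relids)          # every old mapping is removed
    added = list(new_tables)            # list of (relid, qual) pairs re-added
    return dropped, added


def set_tables_diff(old_relids, new_tables):
    """Naive alternative: drop only tables missing from the new list.

    Insufficient here: a table kept with a changed qual would need its
    dependency records rebuilt, not just a catalog tuple update.
    """
    new_ids = {relid for relid, _ in new_tables}
    dropped = [r for r in old_relids if r not in new_ids]
    added = [(r, q) for r, q in new_tables if r not in old_relids]
    return dropped, added
```

[The drop-all approach trades a little extra catalog churn for correctness of the recorded dependencies.]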
@@ -498,7 +491,8 @@ RemovePublicationRelById(Oid proid)
}
/*
- * Open relations specified by a RangeVar list.
+ * Open relations specified by a list of PublicationTable or RangeVar nodes.
+ *
* The returned tables are locked in ShareUpdateExclusiveLock mode in order to
* add them to a publication.
*/
@@ -508,16 +502,41 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationInfo *pri;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = lfirst_node(RangeVar, lc);
- bool recurse = rv->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a list of PublicationTable
+ * nodes (relation plus optional WHERE clause), whereas ALTER PUBLICATION
+ * ... DROP TABLE provides a list of Relation nodes. Check which element
+ * type this list contains.
+ */
+ if (IsA(lfirst(lc), PublicationTable))
+ whereclause = true;
+ else
+ whereclause = false;
+
+ if (whereclause)
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+ else
+ {
+ rv = lfirst_node(RangeVar, lc);
+ }
+
+ recurse = rv->inh;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
@@ -537,8 +556,14 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = myrelid;
+ pri->relation = rel;
+ if (whereclause)
+ pri->whereClause = t->whereClause;
+ else
+ pri->whereClause = NULL;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, myrelid);
/*
@@ -571,7 +596,15 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = childrelid;
+ pri->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (whereclause)
+ pri->whereClause = t->whereClause;
+ else
+ pri->whereClause = NULL;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, childrelid);
}
}
@@ -592,10 +625,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(pri->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -611,15 +646,15 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(pri->relid, GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(pri->relation->rd_rel->relkind),
+ RelationGetRelationName(pri->relation));
- obj = publication_add_relation(pubid, rel, if_not_exists);
+ obj = publication_add_relation(pubid, pri, if_not_exists);
if (stmt)
{
EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
@@ -643,11 +678,10 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pri->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -657,7 +691,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pri->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 38251c2..291c01b 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4828,6 +4828,17 @@ _copyAlterPublicationStmt(const AlterPublicationStmt *from)
return newnode;
}
+static PublicationTable *
+_copyPublicationTable(const PublicationTable *from)
+{
+ PublicationTable *newnode = makeNode(PublicationTable);
+
+ COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
+
+ return newnode;
+}
+
static CreateSubscriptionStmt *
_copyCreateSubscriptionStmt(const CreateSubscriptionStmt *from)
{
@@ -5692,6 +5703,9 @@ copyObjectImpl(const void *from)
case T_AlterPublicationStmt:
retval = _copyAlterPublicationStmt(from);
break;
+ case T_PublicationTable:
+ retval = _copyPublicationTable(from);
+ break;
case T_CreateSubscriptionStmt:
retval = _copyCreateSubscriptionStmt(from);
break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 8a17620..3ec66c4 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2315,6 +2315,15 @@ _equalAlterPublicationStmt(const AlterPublicationStmt *a,
}
static bool
+_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
+{
+ COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
+
+ return true;
+}
+
+static bool
_equalCreateSubscriptionStmt(const CreateSubscriptionStmt *a,
const CreateSubscriptionStmt *b)
{
@@ -3700,6 +3709,9 @@ equal(const void *a, const void *b)
case T_AlterPublicationStmt:
retval = _equalAlterPublicationStmt(a, b);
break;
+ case T_PublicationTable:
+ retval = _equalPublicationTable(a, b);
+ break;
case T_CreateSubscriptionStmt:
retval = _equalCreateSubscriptionStmt(a, b);
break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 39a2849..96c42d8 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -426,14 +426,14 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
vacuum_relation_list opt_vacuum_relation_list
- drop_option_list
+ drop_option_list publication_table_list
%type <node> opt_routine_body
%type <groupclause> group_clause
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -9620,7 +9620,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9651,7 +9651,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9659,7 +9659,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9677,6 +9677,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f928c32..fc4170e 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -509,6 +516,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1769,6 +1777,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3089,6 +3100,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..e946f17 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..48bdbc3 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get the relation qual. DISTINCT prevents the same expression from
+ * being included multiple times in the final expression when the table
+ * belongs to multiple publications.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
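[The copy_table() changes above fall back to a COPY (SELECT ...) form whenever the table carries row filters, AND'ing together the quals collected from the publications. A small Python sketch of the query shapes involved, with identifier quoting omitted for brevity:]

```python
def build_copy_query(nspname, relname, attnames, quals):
    """Sketch of copy_table()'s query construction for the initial sync.

    A plain "COPY tab TO STDOUT" is used only when there is no row filter;
    otherwise a COPY (SELECT ... WHERE q1 AND q2 ...) is built.
    """
    target = '%s.%s' % (nspname, relname)   # identifier quoting omitted
    if not quals:
        return 'COPY %s TO STDOUT' % target
    cmd = 'COPY (SELECT %s FROM %s' % (', '.join(attnames), target)
    cmd += ' WHERE ' + ' AND '.join(quals)
    return cmd + ') TO STDOUT'
```

[AND'ing the filters means a row must satisfy every publication's qual to be copied, matching the streaming-path semantics in pgoutput_row_filter().]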
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 14d737f..8d3a6b5 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,9 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *rfnode_list; /* Row filters */
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +149,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +158,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +640,172 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache ExprState using CacheMemoryContext. This is the same code as
+ * ExecPrepareExpr() but that is not used because it doesn't use an EState.
+ * It should probably be another function in the executor to handle the
+ * execution outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false, i.e., the change
+ * isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change should be replicated; otherwise, false.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+
+ /* Bail out if there is no row filter */
+ if (entry->rfnode_list == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the ExprState list is NIL, it has not yet been built for this
+ * relation, so build it now.
+ */
+ if (entry->exprstate == NIL)
+ {
+ MemoryContext oldctx;
+
+ foreach(lc, entry->rfnode_list)
+ {
+ Node *rfnode = (Node *) lfirst(lc);
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ Assert(entry->exprstate != NIL);
+ }
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +832,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +856,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +863,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * Schema should be sent before the logic that replaces the
+ * relation because it also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +896,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +930,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +999,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1113,9 +1318,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1138,6 +1344,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->rfnode_list = NIL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1149,6 +1358,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1162,6 +1372,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for the row filter. The TupleDesc must
+ * live as long as the cache entry remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1171,6 +1397,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1230,9 +1459,34 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if not already done and if available.
+ *
+ * All publication-table mappings must be checked. If it is a
+ * partition and pubviaroot is true, use the row filter of the
+ * topmost partitioned table instead of the row filter of its own
+ * partition.
+ */
+ if (entry->rfnode_list == NIL)
+ {
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ entry->rfnode_list = lappend(entry->rfnode_list, rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+ }
}
list_free(pubids);
@@ -1339,6 +1593,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Free the row filter caches. They will be rebuilt later if needed.
+ */
+ if (entry->rfnode_list != NIL)
+ {
+ list_free_deep(entry->rfnode_list);
+ entry->rfnode_list = NIL;
+ }
+ if (entry->exprstate != NIL)
+ {
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
+ }
+
}
}
@@ -1350,6 +1619,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1359,6 +1629,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1376,7 +1648,15 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ /*
+ * Note: we do not clean up any row filter cache here, because we only
+ * want to invalidate that cache for relations that have changed.
+ * See rel_sync_cache_relation_cb.
+ */
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
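[The pgoutput changes above give row filters two semantics worth calling out: a filter that evaluates to NULL is treated as false, and when a subscription sees the same table through several publications with different filters, every filter must match before the change is sent. A Python sketch of just that decision logic, with each filter modeled as a callable that may return True, False, or None (SQL NULL):]

```python
def row_filter_matches(filters, row):
    """Sketch of pgoutput_row_filter()'s match semantics.

    NULL (None) counts as false, and all filters must match; a table with
    no filters always replicates.
    """
    for f in filters:
        result = f(row)
        if result is None:      # NULL result => change is not replicated
            return False
        if not result:          # one failing filter is enough to bail out
            return False
    return True
```

[This mirrors the loop over entry->exprstate: the first non-matching filter short-circuits the check.]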
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 90ac445..f4f1298 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4148,6 +4148,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4158,9 +4159,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4169,6 +4177,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4209,6 +4218,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4241,8 +4254,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index f5e170e..0c31005 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -627,6 +627,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 8333558..97250d2 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBuffer(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index f332bad..2703b9c 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -83,6 +83,13 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationInfo
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationInfo;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -108,7 +115,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 6a4d82f..56d13ff 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -490,6 +490,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 7af13de..875b809 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3625,12 +3625,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3643,7 +3650,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 1500de2..4537543 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 4a5ef0b..319c6bc 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -158,6 +158,77 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: cannot add relation "testpub_view" to publication
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075..b1606cc 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,38 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/024_row_filter.pl b/src/test/subscription/t/024_row_filter.pl
new file mode 100644
index 0000000..ca8153e
--- /dev/null
+++ b/src/test/subscription/t/024_row_filter.pl
@@ -0,0 +1,298 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions must succeed. In this case, rows are replicated only if the c
+# value is divisible by both 2 and 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
Attachment: v24-0002-TEMP-extra-logging-for-cache-debugging.patch (application/octet-stream)
From eb6c97f2120ba3ede48ff525e8e7239134814e11 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Tue, 24 Aug 2021 15:51:11 +1000
Subject: [PATCH v24] TEMP - extra logging for cache debugging.
This patch just injects some extra logging into the code which is helpful
to see what is going on for debugging the row filter caching.
None of this patch is destined to be committed.
---
src/backend/replication/pgoutput/pgoutput.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 8d3a6b5..f2beaf0 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -766,11 +766,15 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
{
MemoryContext oldctx;
+ elog(LOG, "!!> pgoutput_row_filter: (re)build the ExprState cache for relid=%d", relid);
+
foreach(lc, entry->rfnode_list)
{
Node *rfnode = (Node *) lfirst(lc);
ExprState *exprstate;
+ elog(LOG, "!!> pgoutput_row_filter: build the ExprState on-the-fly");
+
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
exprstate = pgoutput_row_filter_init_expr(rfnode);
entry->exprstate = lappend(entry->exprstate, exprstate);
@@ -789,6 +793,8 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
{
ExprState *exprstate = (ExprState *) lfirst(lc);
+ elog(LOG, "!!> pgoutput_row_filter: using cached ExprState for relid=%d", relid);
+
/* Evaluates row filter */
result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
@@ -1255,6 +1261,8 @@ init_rel_sync_cache(MemoryContext cachectx)
{
HASHCTL ctl;
+ elog(LOG, "!!> HELLO init_rel_sync_cache");
+
if (RelationSyncCache != NULL)
return;
@@ -1329,6 +1337,8 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
Assert(RelationSyncCache != NULL);
+ elog(LOG, "!!> HELLO get_rel_sync_entry for relid=%d", relid);
+
/* Find cached relation info, creating if not found */
entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
(void *) &relid,
@@ -1482,6 +1492,7 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
rfnode = stringToNode(TextDatumGetCString(rfdatum));
entry->rfnode_list = lappend(entry->rfnode_list, rfnode);
MemoryContextSwitchTo(oldctx);
+ elog(LOG, "!!> get_rel_sync_entry: caching the rfnode_list for relid=%d", relid);
}
ReleaseSysCache(rftuple);
@@ -1516,6 +1527,8 @@ cleanup_rel_sync_cache(TransactionId xid, bool is_commit)
RelationSyncEntry *entry;
ListCell *lc;
+ elog(LOG, "!!> HELLO cleanup_rel_sync_cache");
+
Assert(RelationSyncCache != NULL);
hash_seq_init(&hash_seq, RelationSyncCache);
@@ -1550,6 +1563,8 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
{
RelationSyncEntry *entry;
+ elog(LOG, "!!> HELLO rel_sync_cache_relation_cb relid=%d", relid);
+
/*
* We can get here if the plugin was used in SQL interface as the
* RelSchemaSyncCache is destroyed when the decoding finishes, but there
@@ -1579,6 +1594,9 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
*/
if (entry != NULL)
{
+ elog(LOG, "!!> rel_sync_cache_relation_cb: rfnode_list %s", entry->rfnode_list == NIL ? "NIL" : "not NIL");
+ elog(LOG, "!!> rel_sync_cache_relation_cb: exprstate %s", entry->exprstate == NIL ? "NIL" : "not NIL");
+
entry->schema_sent = false;
list_free(entry->streamed_txns);
entry->streamed_txns = NIL;
@@ -1599,11 +1617,13 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
*/
if (entry->rfnode_list != NIL)
{
+ elog(LOG, "!!> rel_sync_cache_relation_cb: cleanup rfnode_list cache for relid=%d", relid);
list_free_deep(entry->rfnode_list);
entry->rfnode_list = NIL;
}
if (entry->exprstate != NIL)
{
+ elog(LOG, "!!> rel_sync_cache_relation_cb: cleanup ExprState cache for relid=%d", relid);
list_free_deep(entry->exprstate);
entry->exprstate = NIL;
}
--
1.8.3.1
On Tue, Aug 24, 2021, at 4:46 AM, Peter Smith wrote:
I have used debug logging to confirm that what Amit wrote [1] is
correct; the row-filter ExprState of *every* table's row_filter will
be invalidated (and so subsequently gets rebuilt) when the user
changes the PUBLICATION tables. This was a side-effect of the
rel_sync_cache_publication_cb which is freeing the cached ExprState
and setting the entry->replicate_valid = false; for *every* entry.

So yes, the ExprCache is getting rebuilt for some situations where it
is not strictly necessary to do so.
I'm afraid we are overengineering this feature. We already have a cache
mechanism that was suggested (and that shows a small improvement). As you said, the
gain from this new improvement is zero or minimal (it depends on your logical
replication setup/maintenance).
1. Although the ExprState cache is effective, in practice the
performance improvement was not very much. My previous results [2]
showed only about 2sec saving for 100K calls to the
pgoutput_row_filter function. So I think eliminating just one or two
unnecessary calls in the get_rel_sync_entry is going to make zero
observable difference.

2. IMO it is safe to expect that ALTER PUBLICATION is a rare
operation relative to the number of times that pgoutput_row_filter
will be called (the pgoutput_row_filter is quite a "hot" function
since it is called for every INSERT/UPDATE/DELETE). It will be orders
of magnitude difference 1:1000, 1:100000 etc.

~~
Anyway, I have implemented the suggested cache change because I agree
it is probably theoretically superior, even if in practice there is
almost no difference.
I didn't inspect your patch carefully, but it seems you add another List to
control this new cache mechanism. I don't like it. IMO, if we can use the data
structures that we have now, let's implement your idea; otherwise, -1 for this
micro-optimization.
[By the way, it took some time to extract what you changed. Since we're trading
patches, I would personally appreciate it if you could send a patch on top of the
current one. I have some changes too, and it is time-consuming to incorporate
changes into the main patch.]
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Wed, Aug 25, 2021 at 5:52 AM Euler Taveira <euler@eulerto.com> wrote:
On Tue, Aug 24, 2021, at 4:46 AM, Peter Smith wrote:
I have used debug logging to confirm that what Amit wrote [1] is
correct; the row-filter ExprState of *every* table's row_filter will
be invalidated (and so subsequently gets rebuilt) when the user
changes the PUBLICATION tables. This was a side-effect of the
rel_sync_cache_publication_cb which is freeing the cached ExprState
and setting the entry->replicate_valid = false; for *every* entry.
So yes, the ExprCache is getting rebuilt for some situations where it
is not strictly necessary to do so.
I'm afraid we are overengineering this feature. We already have a cache
mechanism that was suggested (that shows a small improvement). As you said the
gain for this new improvement is zero or minimal (it depends on your logical
replication setup/maintenance).
Hmm, I think the gain via caching is not visible because we are using
simple expressions. It will be visible when we use somewhat complex
expressions where expression evaluation cost is significant.
Similarly, the impact of this change will magnify and it will also be
visible when a publication has many tables. Apart from performance,
this change is logically correct as well, because it is in any case
better not to invalidate the cached expressions unless required.
1. Although the ExprState cache is effective, in practice the
performance improvement was not very much. My previous results [2]
showed only about 2sec saving for 100K calls to the
pgoutput_row_filter function. So I think eliminating just one or two
unnecessary calls in the get_rel_sync_entry is going to make zero
observable difference.
2. IMO it is safe to expect that ALTER PUBLICATION is a rare
operation relative to the number of times that pgoutput_row_filter
will be called (the pgoutput_row_filter is quite a "hot" function
since it is called for every INSERT/UPDATE/DELETE). It will be orders
of magnitude difference 1:1000, 1:100000 etc.
~~
Anyway, I have implemented the suggested cache change because I agree
it is probably theoretically superior, even if in practice there is
almost no difference.
I didn't inspect your patch carefully but it seems you add another List to
control this new cache mechanism. I don't like it. IMO if we can use the data
structures that we have now, let's implement your idea; otherwise, -1 for this
new micro optimization.
As mentioned above, without this we will invalidate many cached
expressions even though it is not required. I don't deny that there
might be a better way to achieve the same and if you or Peter have any
ideas, I am all ears. If there are technical challenges to achieve the
same or it makes the patch complex then certainly we can discuss but
according to me, this should not introduce additional complexity.
[By the way, it took some time to extract what you changed. Since we're trading
patches, I would personally appreciate it if you could send a patch on top of
the current one.
+1.
--
With Regards,
Amit Kapila.
On Wed, Aug 25, 2021 at 10:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Aug 25, 2021 at 5:52 AM Euler Taveira <euler@eulerto.com> wrote:
On Tue, Aug 24, 2021, at 4:46 AM, Peter Smith wrote:
Anyway, I have implemented the suggested cache change because I agree
it is probably theoretically superior, even if in practice there is
almost no difference.
I didn't inspect your patch carefully but it seems you add another List to
control this new cache mechanism. I don't like it. IMO if we can use the data
structures that we have now, let's implement your idea; otherwise, -1 for this
new micro optimization.
As mentioned above, without this we will invalidate many cached
expressions even though it is not required. I don't deny that there
might be a better way to achieve the same and if you or Peter have any
ideas, I am all ears.
I see that the new list is added to store row_filter node which we
later use to compute expression. This is not required for invalidation
but for delaying the expression evaluation till it is required (for
example, for truncate, we may not need the row evaluation, so there is
no need to compute it). Can we try to postpone the syscache lookup to
a later stage when we are actually doing row_filtering? If we can do
that, then I think we can avoid having this extra list?
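A minimal sketch of the deferral being suggested here; all names are hypothetical stand-ins, not the patch's actual structures. The idea is to keep only the raw qual at cache-build time and prepare the executable expression on first use inside the row filter, so code paths like TRUNCATE that never filter rows never pay the preparation cost.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch of lazy expression preparation. Only the raw
 * qual (a string here stands in for the pg_node_tree value) is kept
 * when the cache entry is built. */
typedef struct FilterSlot
{
    const char *qual_source;    /* raw qual, as fetched from the catalog */
    const void *exprstate;      /* prepared form; NULL until first use */
    int         prepare_count;  /* stands in for the expensive prepare step */
} FilterSlot;

static const void *
prepare_expr(FilterSlot *slot)
{
    slot->prepare_count++;      /* the expensive step we want to defer */
    return slot->qual_source;   /* dummy handle for the prepared state */
}

/* Row-filter entry point: prepare lazily, then (pretend to) evaluate.
 * A TRUNCATE, which never calls this, pays no preparation cost. */
static int
filter_row(FilterSlot *slot)
{
    if (slot->exprstate == NULL)
        slot->exprstate = prepare_expr(slot);
    return 1;                   /* pretend the row matched */
}
```

Repeated calls prepare the expression exactly once, and a slot that is never filtered is never prepared at all, which is the behaviour the postponement is meant to buy.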
--
With Regards,
Amit Kapila.
On Wed, Aug 25, 2021 at 3:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
...
Hmm, I think the gain via caching is not visible because we are using
simple expressions. It will be visible when we use somewhat complex
expressions where expression evaluation cost is significant.
Similarly, the impact of this change will magnify and it will also be
visible when a publication has many tables. Apart from performance,
this change is logically correct as well, because it is in any case
better not to invalidate the cached expressions unless required.
Please tell me what is your idea of a "complex" row filter expression.
Do you just mean a filter that has multiple AND conditions in it? I
don't really know if a few complex expressions would amount to any
significant evaluation costs, so I would like to run some timing tests
with some real examples to see the results.
On Wed, Aug 25, 2021 at 6:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Aug 25, 2021 at 10:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Aug 25, 2021 at 5:52 AM Euler Taveira <euler@eulerto.com> wrote:
On Tue, Aug 24, 2021, at 4:46 AM, Peter Smith wrote:
Anyway, I have implemented the suggested cache change because I agree
it is probably theoretically superior, even if in practice there is
almost no difference.
I didn't inspect your patch carefully but it seems you add another List to
control this new cache mechanism. I don't like it. IMO if we can use the data
structures that we have now, let's implement your idea; otherwise, -1 for this
new micro optimization.
As mentioned above, without this we will invalidate many cached
expressions even though it is not required. I don't deny that there
might be a better way to achieve the same and if you or Peter have any
ideas, I am all ears.
I see that the new list is added to store row_filter node which we
later use to compute expression. This is not required for invalidation
but for delaying the expression evaluation till it is required (for
example, for truncate, we may not need the row evaluation, so there is
no need to compute it). Can we try to postpone the syscache lookup to
a later stage when we are actually doing row_filtering? If we can do
that, then I think we can avoid having this extra list?
Yes, you are correct - that Node list was re-instated only because you
had requested that the ExprState evaluation should be deferred until
it is needed by the pgoutput_row_filter. Otherwise, the additional
list would not be needed so everything would be much the same as in
v23 except the invalidations would be more focussed on single tables.
I don't think the syscache lookup can be easily postponed. That logic
of get_rel_sync_entry processes the table filters of *all*
publications, so moving that publications loop (including the
partition logic) into the pgoutput_row_filter seems a bridge too far
IMO.
Furthermore, I am not yet convinced that this ExprState postponement
is very useful. It may be true that for truncate there is no need to
compute it, but consider that the user would never even define a row
filter in the first place unless they intended there will be some CRUD
operations. So even if the truncate does not need the filter,
*something* is surely going to need it. In other words, IIUC this
postponement is not going to save any time overall - it only shifts
when the (one-time) expression evaluation will happen.
I feel it would be better to just remove the postponed evaluation of
the ExprState added in v24. That will remove any need for the extra
Node list (which I think is Euler's concern). The ExprState cache will
still be slightly improved from how it was implemented before because
it is "logically correct" that we don't invalidate the cached
expressions unless required.
Thoughts?
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Thu, Aug 26, 2021 at 7:37 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Wed, Aug 25, 2021 at 3:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
...
Hmm, I think the gain via caching is not visible because we are using
simple expressions. It will be visible when we use somewhat complex
expressions where expression evaluation cost is significant.
Similarly, the impact of this change will magnify and it will also be
visible when a publication has many tables. Apart from performance,
this change is logically correct as well, because it is in any case
better not to invalidate the cached expressions unless required.
Please tell me what is your idea of a "complex" row filter expression.
Do you just mean a filter that has multiple AND conditions in it? I
don't really know if a few complex expressions would amount to any
significant evaluation costs, so I would like to run some timing tests
with some real examples to see the results.
I think this means you don't yet understand, or aren't convinced of, why
the patch has the cache in the first place. By your theory, even if we
didn't have the cache it wouldn't matter, but that is not true; otherwise
the patch wouldn't have it.
On Wed, Aug 25, 2021 at 6:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Aug 25, 2021 at 10:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Aug 25, 2021 at 5:52 AM Euler Taveira <euler@eulerto.com> wrote:
On Tue, Aug 24, 2021, at 4:46 AM, Peter Smith wrote:
Anyway, I have implemented the suggested cache change because I agree
it is probably theoretically superior, even if in practice there is
almost no difference.
I didn't inspect your patch carefully but it seems you add another List to
control this new cache mechanism. I don't like it. IMO if we can use the data
structures that we have now, let's implement your idea; otherwise, -1 for this
new micro optimization.
As mentioned above, without this we will invalidate many cached
expressions even though it is not required. I don't deny that there
might be a better way to achieve the same and if you or Peter have any
ideas, I am all ears.
I see that the new list is added to store row_filter node which we
later use to compute expression. This is not required for invalidation
but for delaying the expression evaluation till it is required (for
example, for truncate, we may not need the row evaluation, so there is
no need to compute it). Can we try to postpone the syscache lookup to
a later stage when we are actually doing row_filtering? If we can do
that, then I think we can avoid having this extra list?
Yes, you are correct - that Node list was re-instated only because you
had requested that the ExprState evaluation should be deferred until
it is needed by the pgoutput_row_filter. Otherwise, the additional
list would not be needed so everything would be much the same as in
v23 except the invalidations would be more focussed on single tables.
I don't think the syscache lookup can be easily postponed. That logic
of get_rel_sync_entry processes the table filters of *all*
publications, so moving that publications loop (including the
partition logic) into the pgoutput_row_filter seems a bridge too far
IMO.
Hmm, I don't think that is true. You would just need it for the relation
being processed.
Furthermore, I am not yet convinced that this ExprState postponement
is very useful. It may be true that for truncate there is no need to
compute it, but consider that the user would never even define a row
filter in the first place unless they intended there will be some CRUD
operations. So even if the truncate does not need the filter,
*something* is surely going to need it.
Sure, but we don't need to add additional computation until it is required.
--
With Regards,
Amit Kapila.
On Thu, Aug 26, 2021 at 1:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 26, 2021 at 7:37 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Wed, Aug 25, 2021 at 3:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
...
Hmm, I think the gain via caching is not visible because we are using
simple expressions. It will be visible when we use somewhat complex
expressions where expression evaluation cost is significant.
Similarly, the impact of this change will magnify and it will also be
visible when a publication has many tables. Apart from performance,
this change is logically correct as well, because it is in any case
better not to invalidate the cached expressions unless required.
Please tell me what is your idea of a "complex" row filter expression.
Do you just mean a filter that has multiple AND conditions in it? I
don't really know if a few complex expressions would amount to any
significant evaluation costs, so I would like to run some timing tests
with some real examples to see the results.
I think this means you don't yet understand, or aren't convinced of, why
the patch has the cache in the first place. By your theory, even if we
didn't have the cache it wouldn't matter, but that is not true; otherwise
the patch wouldn't have it.
I have never said there should be no caching. On the contrary, my
performance test results [1] already confirmed that caching ExprState
is of benefit for the millions of times it may be used in the
pgoutput_row_filter function. My only doubts are in regard to how much
observable impact there would be re-evaluating the filter expression
just a few extra times by the get_rel_sync_entry function.
------
[1]: /messages/by-id/CAHut+Ps5j7mkO0xLmNW=kXh0eezGoKyzBCiQc9bfkCiM_MVDrg@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Thu, Aug 26, 2021 at 9:51 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Thu, Aug 26, 2021 at 1:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 26, 2021 at 7:37 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Wed, Aug 25, 2021 at 3:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
...
Hmm, I think the gain via caching is not visible because we are using
simple expressions. It will be visible when we use somewhat complex
expressions where expression evaluation cost is significant.
Similarly, the impact of this change will magnify and it will also be
visible when a publication has many tables. Apart from performance,
this change is logically correct as well, because it is in any case
better not to invalidate the cached expressions unless required.
Please tell me what is your idea of a "complex" row filter expression.
Do you just mean a filter that has multiple AND conditions in it? I
don't really know if a few complex expressions would amount to any
significant evaluation costs, so I would like to run some timing tests
with some real examples to see the results.
I think this means you don't yet understand, or aren't convinced of, why
the patch has the cache in the first place. By your theory, even if we
didn't have the cache it wouldn't matter, but that is not true; otherwise
the patch wouldn't have it.
I have never said there should be no caching. On the contrary, my
performance test results [1] already confirmed that caching ExprState
is of benefit for the millions of times it may be used in the
pgoutput_row_filter function. My only doubts are in regard to how much
observable impact there would be re-evaluating the filter expression
just a few extra times by the get_rel_sync_entry function.
I think it depends, but why allow re-evaluation in the first place when
there is a way to avoid it?
--
With Regards,
Amit Kapila.
On Wed, Aug 25, 2021 at 10:22 AM Euler Taveira <euler@eulerto.com> wrote:
....
[By the way, it took some time to extract what you changed. Since we're trading
patches, I would personally appreciate it if you could send a patch on top of
the current one. I have some changes too, and it is time-consuming to
incorporate changes into the main patch.]
OK. Sorry for causing you trouble.
Here I am re-posting the ExprState cache changes as an incremental
patch on top of the last rebased row-filter patch (v23).
v25-0001 <--- v23 (last rebased main patch)
v25-0002 ExprState cache mods
v25-0003 ExprState cache extra debug logging (temp)
Hopefully, this will make it easier to deal with this change in isolation.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v25-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v25-0001-Row-filter-for-logical-replication.patchDownload
From cc2f27a9c428f27f267c14d6aa698ec6fb390ef4 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 26 Aug 2021 18:52:23 +1000
Subject: [PATCH v25] Row filter for logical replication
This feature adds row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If subscriber is a pre-15 version, data
synchronization won't use row filters if they are defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 47 ++++-
src/backend/commands/publicationcmds.c | 112 +++++++----
src/backend/nodes/copyfuncs.c | 14 ++
src/backend/nodes/equalfuncs.c | 12 ++
src/backend/parser/gram.y | 24 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 13 ++
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 255 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 71 +++++++
src/test/regress/sql/publication.sql | 32 +++
src/test/subscription/t/024_row_filter.pl | 298 ++++++++++++++++++++++++++++
26 files changed, 1048 insertions(+), 75 deletions(-)
create mode 100644 src/test/subscription/t/024_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2b2c70a..d473af1 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6231,6 +6231,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b..4bb4314 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbc..35006d9 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -183,6 +187,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or be covered by <literal>REPLICA
+ IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
+ be replicated. That's because old row is used and it only contains primary
+ key or columns that are part of the <literal>REPLICA IDENTITY</literal>; the
+ remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ and <command>UPDATE</command> operations, any column might be used in the
+ <literal>WHERE</literal> clause. New row is used and it contains all
+ columns. A <literal>NULL</literal> value causes the expression to evaluate
+ to false; avoid using columns without not-null constraints in the
+ <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
+ not allow functions and user-defined operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -197,6 +216,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -210,6 +234,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 702934e..94e3981 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 2a2fe03..6057fc3 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,21 +144,27 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -177,6 +186,26 @@ publication_add_relation(Oid pubid, Relation targetrel,
check_publication_add_relation(targetrel);
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
+
/* Form a tuple. */
memset(values, 0, sizeof(values));
memset(nulls, false, sizeof(nulls));
@@ -189,6 +218,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -205,6 +240,14 @@ publication_add_relation(Oid pubid, Relation targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 8487eeb..4709a71 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -384,31 +384,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- Relation newrel = (Relation) lfirst(newlc);
-
- if (RelationGetRelid(newrel) == oldrelid)
- {
- found = true;
- break;
- }
- }
-
- if (!found)
- {
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
-
- delrels = lappend(delrels, oldrel);
- }
+ PublicationRelationInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelationInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -498,7 +491,8 @@ RemovePublicationRelById(Oid proid)
}
/*
- * Open relations specified by a RangeVar list.
+ * Open relations specified by a RangeVar list (PublicationTable or Relation).
+ *
* The returned tables are locked in ShareUpdateExclusiveLock mode in order to
* add them to a publication.
*/
@@ -508,16 +502,41 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationInfo *pri;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = lfirst_node(RangeVar, lc);
- bool recurse = rv->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a list of PublicationTable
+ * nodes (relation plus optional WHERE clause), while ALTER PUBLICATION
+ * ... DROP TABLE provides a plain Relation list. Check which kind of
+ * list element we were given.
+ */
+ if (IsA(lfirst(lc), PublicationTable))
+ whereclause = true;
+ else
+ whereclause = false;
+
+ if (whereclause)
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+ else
+ {
+ rv = lfirst_node(RangeVar, lc);
+ }
+
+ recurse = rv->inh;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
@@ -537,8 +556,14 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = myrelid;
+ pri->relation = rel;
+ if (whereclause)
+ pri->whereClause = t->whereClause;
+ else
+ pri->whereClause = NULL;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, myrelid);
/*
@@ -571,7 +596,15 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = childrelid;
+ pri->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (whereclause)
+ pri->whereClause = t->whereClause;
+ else
+ pri->whereClause = NULL;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, childrelid);
}
}
@@ -592,10 +625,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(pri->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -611,15 +646,15 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(pri->relid, GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(pri->relation->rd_rel->relkind),
+ RelationGetRelationName(pri->relation));
- obj = publication_add_relation(pubid, rel, if_not_exists);
+ obj = publication_add_relation(pubid, pri, if_not_exists);
if (stmt)
{
EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
@@ -643,11 +678,10 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pri->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -657,7 +691,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pri->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 38251c2..291c01b 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4828,6 +4828,17 @@ _copyAlterPublicationStmt(const AlterPublicationStmt *from)
return newnode;
}
+static PublicationTable *
+_copyPublicationTable(const PublicationTable *from)
+{
+ PublicationTable *newnode = makeNode(PublicationTable);
+
+ COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
+
+ return newnode;
+}
+
static CreateSubscriptionStmt *
_copyCreateSubscriptionStmt(const CreateSubscriptionStmt *from)
{
@@ -5692,6 +5703,9 @@ copyObjectImpl(const void *from)
case T_AlterPublicationStmt:
retval = _copyAlterPublicationStmt(from);
break;
+ case T_PublicationTable:
+ retval = _copyPublicationTable(from);
+ break;
case T_CreateSubscriptionStmt:
retval = _copyCreateSubscriptionStmt(from);
break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 8a17620..3ec66c4 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2315,6 +2315,15 @@ _equalAlterPublicationStmt(const AlterPublicationStmt *a,
}
static bool
+_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
+{
+ COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
+
+ return true;
+}
+
+static bool
_equalCreateSubscriptionStmt(const CreateSubscriptionStmt *a,
const CreateSubscriptionStmt *b)
{
@@ -3700,6 +3709,9 @@ equal(const void *a, const void *b)
case T_AlterPublicationStmt:
retval = _equalAlterPublicationStmt(a, b);
break;
+ case T_PublicationTable:
+ retval = _equalPublicationTable(a, b);
+ break;
case T_CreateSubscriptionStmt:
retval = _equalCreateSubscriptionStmt(a, b);
break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 39a2849..96c42d8 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -426,14 +426,14 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
vacuum_relation_list opt_vacuum_relation_list
- drop_option_list
+ drop_option_list publication_table_list
%type <node> opt_routine_body
%type <groupclause> group_clause
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -9620,7 +9620,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9651,7 +9651,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9659,7 +9659,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9677,6 +9677,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
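For reference, the grammar additions above accept statements of the following shape (a sketch; table, column, and publication names are illustrative):

```sql
-- publication_table_elem: relation_expr plus an optional WHERE clause
CREATE PUBLICATION pub_active FOR TABLE departments WHERE (id > 2000 AND id <= 3000);

-- ALTER PUBLICATION ... ADD/SET TABLE take the same optional WHERE clause
ALTER PUBLICATION pub_active ADD TABLE employees WHERE (active IS TRUE);

-- DROP TABLE still takes a plain relation list, with no WHERE clause
ALTER PUBLICATION pub_active DROP TABLE employees;
```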
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f928c32..fc4170e 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -509,6 +516,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1769,6 +1777,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3089,6 +3100,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..e946f17 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
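The parser hooks above restrict what a row filter may contain. Assuming the error paths added in parse_expr.c, parse_func.c, and parse_oper.c, statements like these would be rejected at CREATE/ALTER PUBLICATION time (names are illustrative):

```sql
-- functions are not allowed in publication WHERE expressions
CREATE PUBLICATION p1 FOR TABLE t WHERE (random() > 0.5);

-- cannot use subquery in publication WHERE expression
CREATE PUBLICATION p2 FOR TABLE t WHERE (id IN (SELECT id FROM other));

-- user-defined operators are not allowed in publication WHERE expressions
CREATE PUBLICATION p3 FOR TABLE t WHERE (a ~~~ b);
```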
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..48bdbc3 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the
+ * relation qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get the relation qual. DISTINCT prevents the same expression from being
+ * included multiple times in the final expression when the table is part
+ * of multiple publications.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
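With the tablesync changes above, initial synchronization against a >= 15 publisher first fetches the quals and then builds a filtered COPY. A sketch of what the subscriber would send (OIDs, publication names, and columns are illustrative):

```sql
-- Qual lookup issued by fetch_remote_table_info():
SELECT DISTINCT pg_get_expr(prqual, prrelid)
  FROM pg_publication p
 INNER JOIN pg_publication_rel pr ON (p.oid = pr.prpubid)
 WHERE pr.prrelid = 16384
   AND p.pubname IN ('pub_a', 'pub_b');

-- Resulting initial-copy command; filters from multiple publications are AND'ed:
COPY (SELECT id, name FROM public.departments
      WHERE (id > 2000 AND id <= 3000)) TO STDOUT;
```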
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 14d737f..ef1ba91 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +148,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +157,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +639,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache ExprState using CacheMemoryContext. This is the same code as
+ * ExecPrepareExpr() but that is not used because it doesn't use an EState.
+ * It should probably be another function in the executor to handle the
+ * execution outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluate the row filter.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change should be replicated, false otherwise.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the same table is published with different row filters in multiple
+ * publications of the subscription, all of the filters must match for
+ * the change to be replicated.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +808,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +832,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +839,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * Send the schema before the relation-switching logic below,
+ * because it also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +872,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +906,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +975,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1113,9 +1294,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1138,6 +1320,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1149,6 +1333,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1162,6 +1347,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1171,6 +1372,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1230,9 +1434,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1350,6 +1578,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1359,6 +1588,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1376,7 +1607,13 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
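To illustrate the pgoutput semantics above (names are illustrative): a filter that evaluates to NULL is treated as false, so with the publication below a row whose dept is NULL would not be replicated; and if the table appears in several of the subscription's publications with different filters, a change must satisfy all of them to be sent.

```sql
CREATE PUBLICATION pub_dept FOR TABLE employees WHERE (dept = 'sales');

-- An insert with dept IS NULL is filtered out on the publisher:
INSERT INTO employees (name, dept) VALUES ('new hire', NULL);
```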
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 90ac445..f4f1298 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4148,6 +4148,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4158,9 +4159,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4169,6 +4177,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4209,6 +4218,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4241,8 +4254,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
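With the pg_dump changes above, a dump of a publication membership that has a qual would emit something along these lines (a sketch; names taken from the example in this post):

```sql
ALTER PUBLICATION pub_active ADD TABLE ONLY public.departments WHERE (id > 2000 AND id <= 3000);
```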
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index f5e170e..0c31005 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -627,6 +627,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 8333558..97250d2 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBuffer(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index f332bad..2703b9c 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -83,6 +83,13 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationInfo
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationInfo;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -108,7 +115,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 6a4d82f..56d13ff 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -490,6 +490,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 7af13de..875b809 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3625,12 +3625,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3643,7 +3650,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 1500de2..4537543 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 4a5ef0b..319c6bc 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -158,6 +158,77 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: cannot add relation "testpub_view" to publication
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075..b1606cc 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,38 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/024_row_filter.pl b/src/test/subscription/t/024_row_filter.pl
new file mode 100644
index 0000000..ca8153e
--- /dev/null
+++ b/src/test/subscription/t/024_row_filter.pl
@@ -0,0 +1,298 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, every filter
+# expression must succeed for a row to be replicated. In this case, rows
+# are replicated only if c is divisible by both 2 AND 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
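The multiple-publication semantics exercised by the TAP test above (tap_pub_1 filters with `c % 2 = 0`, tap_pub_2 with `c % 3 = 0`, and only rows 6, 12, 18 survive) can be sketched outside the server as a logical AND over all filters. This is a minimal stand-alone illustration, not PostgreSQL code; the `RowFilter` type and function names are invented for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a compiled row filter: true means the row passes. */
typedef bool (*RowFilter)(int c);

static bool filter_mod2(int c) { return c % 2 == 0; } /* tap_pub_1: c % 2 = 0 */
static bool filter_mod3(int c) { return c % 3 == 0; } /* tap_pub_2: c % 3 = 0 */

/*
 * A change is replicated only when every filter from every subscribed
 * publication succeeds (logical AND), mirroring the behavior the test
 * expects for tab_rowfilter_2.
 */
static bool
row_is_published(int c, RowFilter *filters, size_t nfilters)
{
    for (size_t i = 0; i < nfilters; i++)
        if (!filters[i](c))
            return false;
    return true;
}
```

With both filters subscribed, only 3 of the 20 inserted rows (6, 12, 18) replicate, matching the test's expected `3|6|18` result.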
v25-0002-ExprState-cache-modifications.patch (application/octet-stream) Download
From 62ab3cfa9036812384776795fa4da759e0d6c383 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 26 Aug 2021 19:12:16 +1000
Subject: [PATCH v25] ExprState cache modifications.
The cached Node/ExprState lists are now invalidated only in the
rel_sync_cache_relation_cb function, so an ALTER PUBLICATION for one table
should not cause the row filters of other tables to also become invalidated.
The ExprState list is now built lazily, on first use in the
pgoutput_row_filter function.
Changes are based on a suggestion from Amit [1].
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
---
src/backend/replication/pgoutput/pgoutput.c | 89 ++++++++++++++++++++---------
1 file changed, 63 insertions(+), 26 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index ef1ba91..bb6f608 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -123,6 +123,7 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *rfnode_list; /* Row filters */
List *exprstate; /* ExprState for row filter */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
@@ -737,14 +738,15 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
ExprContext *ecxt;
ListCell *lc;
bool result = true;
+ Oid relid = RelationGetRelid(relation);
/* Bail out if there is no row filter */
- if (entry->exprstate == NIL)
+ if (entry->rfnode_list == NIL)
return true;
elog(DEBUG3, "table \"%s.%s\" has row filter",
- get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
- get_rel_name(relation->rd_id));
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
PushActiveSnapshot(GetTransactionSnapshot());
@@ -757,6 +759,28 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
/*
+ * If the ExprState cache list is NIL then it means it is not yet built for this
+ * relation, so build it on-the-fly now.
+ */
+ if (entry->exprstate == NIL)
+ {
+ MemoryContext oldctx;
+
+ foreach(lc, entry->rfnode_list)
+ {
+ Node *rfnode = (Node *) lfirst(lc);
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ Assert(entry->exprstate != NIL);
+ }
+
+ /*
* If the subscription has multiple publications and the same table has a
* different row filter in these publications, all row filters must be
* matched in order to replicate this change.
@@ -1321,6 +1345,7 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
+ entry->rfnode_list = NIL;
entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
@@ -1435,31 +1460,32 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
}
/*
- * Cache row filter, if available. All publication-table mappings
- * must be checked. If it is a partition and pubviaroot is true,
- * use the row filter of the topmost partitioned table instead of
- * the row filter of its own partition.
+ * Cache row filter, if not already done and if available.
+ *
+ * All publication-table mappings must be checked. If it is a
+ * partition and pubviaroot is true, use the row filter of the
+ * topmost partitioned table instead of the row filter of its own
+ * partition.
*/
- rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
- if (HeapTupleIsValid(rftuple))
+ if (entry->rfnode_list == NIL)
{
- rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
-
- if (!rfisnull)
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
{
- Node *rfnode;
- ExprState *exprstate;
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ if (!rfisnull)
+ {
+ Node *rfnode;
- /* Prepare for expression execution */
- exprstate = pgoutput_row_filter_init_expr(rfnode);
- entry->exprstate = lappend(entry->exprstate, exprstate);
- MemoryContextSwitchTo(oldctx);
- }
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ entry->rfnode_list = lappend(entry->rfnode_list, rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
- ReleaseSysCache(rftuple);
+ ReleaseSysCache(rftuple);
+ }
}
}
@@ -1567,6 +1593,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Free the row filter caches. They will be rebuilt later if needed.
+ */
+ if (entry->rfnode_list != NIL)
+ {
+ list_free_deep(entry->rfnode_list);
+ entry->rfnode_list = NIL;
+ entry->replicate_valid = false; /* Clear flag so get_rel_sync_entry will rebuild the cache. */
+ }
+ if (entry->exprstate != NIL)
+ {
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
+ }
}
}
@@ -1607,10 +1648,6 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
-
- if (entry->exprstate != NIL)
- list_free_deep(entry->exprstate);
- entry->exprstate = NIL;
}
MemoryContextSwitchTo(oldctx);
--
1.8.3.1
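The two-level caching scheme in v25-0002 (rfnode_list cached eagerly in get_rel_sync_entry, the ExprState built only on first use in pgoutput_row_filter, both freed in rel_sync_cache_relation_cb) follows a common lazy-compilation pattern that can be sketched in miniature. This is an invented illustration, not PostgreSQL code; `Entry`, `compile_if_needed`, and `invalidate` are hypothetical names standing in for the real structures.

```c
#include <assert.h>

/*
 * Miniature of the two-level cache: the cheap "node" form is cached
 * eagerly, the expensive "compiled" form (the ExprState analogue) is
 * deferred until first needed.
 */
typedef struct Entry
{
    int node;        /* stands in for the cached rfnode_list */
    int compiled;    /* stands in for the ExprState; 0 == not yet built */
    int build_count; /* how many times the expensive step ran */
} Entry;

static void
compile_if_needed(Entry *e)
{
    if (e->compiled == 0)           /* deferred until first use */
    {
        e->compiled = e->node * 2;  /* pretend this step is expensive */
        e->build_count++;
    }
}

/* Invalidation callback analogue: drop both cached forms so they are rebuilt. */
static void
invalidate(Entry *e)
{
    e->node = 0;
    e->compiled = 0;
}
```

The payoff is the one shown in the commit message: invalidating one entry forces only that entry to rebuild, and repeated use of a valid entry never repeats the expensive compilation step.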
v25-0003-ExprState-cache-temp-extra-logging.patch (application/octet-stream) Download
From dbe57b35278772dc1faf10fa7a5cac66c3085807 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 26 Aug 2021 19:25:06 +1000
Subject: [PATCH v25] ExprState cache temp extra logging.
This patch just injects some extra logging into the code, which is helpful
for seeing what is going on when debugging the row-filter caching.
Temporary. Not to be committed.
---
src/backend/replication/pgoutput/pgoutput.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index bb6f608..de6d1e9 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -766,11 +766,15 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
{
MemoryContext oldctx;
+ elog(LOG, "!!> pgoutput_row_filter: (re)build the ExprState cache for relid=%d", relid);
+
foreach(lc, entry->rfnode_list)
{
Node *rfnode = (Node *) lfirst(lc);
ExprState *exprstate;
+ elog(LOG, "!!> pgoutput_row_filter: build the ExprState on-the-fly");
+
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
exprstate = pgoutput_row_filter_init_expr(rfnode);
entry->exprstate = lappend(entry->exprstate, exprstate);
@@ -789,6 +793,8 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
{
ExprState *exprstate = (ExprState *) lfirst(lc);
+ elog(LOG, "!!> pgoutput_row_filter: using cached ExprState for relid=%d", relid);
+
/* Evaluates row filter */
result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
@@ -1255,6 +1261,8 @@ init_rel_sync_cache(MemoryContext cachectx)
{
HASHCTL ctl;
+ elog(LOG, "!!> HELLO init_rel_sync_cache");
+
if (RelationSyncCache != NULL)
return;
@@ -1329,6 +1337,8 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
Assert(RelationSyncCache != NULL);
+ elog(LOG, "!!> HELLO get_rel_sync_entry for relid=%d", relid);
+
/* Find cached relation info, creating if not found */
entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
(void *) &relid,
@@ -1482,6 +1492,7 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
rfnode = stringToNode(TextDatumGetCString(rfdatum));
entry->rfnode_list = lappend(entry->rfnode_list, rfnode);
MemoryContextSwitchTo(oldctx);
+ elog(LOG, "!!> get_rel_sync_entry: caching the rfnode_list for relid=%d", relid);
}
ReleaseSysCache(rftuple);
@@ -1516,6 +1527,8 @@ cleanup_rel_sync_cache(TransactionId xid, bool is_commit)
RelationSyncEntry *entry;
ListCell *lc;
+ elog(LOG, "!!> HELLO cleanup_rel_sync_cache");
+
Assert(RelationSyncCache != NULL);
hash_seq_init(&hash_seq, RelationSyncCache);
@@ -1550,6 +1563,8 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
{
RelationSyncEntry *entry;
+ elog(LOG, "!!> HELLO rel_sync_cache_relation_cb relid=%d", relid);
+
/*
* We can get here if the plugin was used in SQL interface as the
* RelSchemaSyncCache is destroyed when the decoding finishes, but there
@@ -1579,6 +1594,9 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
*/
if (entry != NULL)
{
+ elog(LOG, "!!> rel_sync_cache_relation_cb: rfnode_list %s", entry->rfnode_list == NIL ? "NIL" : "not NIL");
+ elog(LOG, "!!> rel_sync_cache_relation_cb: exprstate %s", entry->exprstate == NIL ? "NIL" : "not NIL");
+
entry->schema_sent = false;
list_free(entry->streamed_txns);
entry->streamed_txns = NIL;
@@ -1599,12 +1617,14 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
*/
if (entry->rfnode_list != NIL)
{
+ elog(LOG, "!!> rel_sync_cache_relation_cb: cleanup rfnode_list cache for relid=%d", relid);
list_free_deep(entry->rfnode_list);
entry->rfnode_list = NIL;
entry->replicate_valid = false; /* Clear flag so get_rel_sync_entry will rebuild the cache. */
}
if (entry->exprstate != NIL)
{
+ elog(LOG, "!!> rel_sync_cache_relation_cb: cleanup ExprState cache for relid=%d", relid);
list_free_deep(entry->exprstate);
entry->exprstate = NIL;
}
@@ -1621,6 +1641,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
RelationSyncEntry *entry;
MemoryContext oldctx;
+ elog(LOG, "!!> HELLO rel_sync_cache_publication_cb");
+
/*
* We can get here if the plugin was used in SQL interface as the
* RelSchemaSyncCache is destroyed when the decoding finishes, but there
--
1.8.3.1
On Thu, Aug 26, 2021 at 3:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 26, 2021 at 9:51 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Thu, Aug 26, 2021 at 1:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 26, 2021 at 7:37 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Wed, Aug 25, 2021 at 3:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
...
Hmm, I think the gain via caching is not visible because we are using
simple expressions. It will be visible when we use somewhat complex
expressions where expression evaluation cost is significant.
Similarly, the impact of this change will magnify and it will also be
visible when a publication has many tables. Apart from performance,
this change is logically correct as well because it would be any way
better if we don't invalidate the cached expressions unless required.

Please tell me what is your idea of a "complex" row filter expression.
Do you just mean a filter that has multiple AND conditions in it? I
don't really know if few complex expressions would amount to any
significant evaluation costs, so I would like to run some timing tests
with some real examples to see the results.

I think this means you didn't even understand or are convinced why the
patch has cache in the first place. As per your theory, even if we
didn't have cache, it won't matter but that is not true otherwise, the
patch wouldn't have it.

I have never said there should be no caching. On the contrary, my
performance test results [1] already confirmed that caching ExprState
is of benefit for the millions of times it may be used in the
pgoutput_row_filter function. My only doubts are in regard to how much
observable impact there would be re-evaluating the filter expression
just a few extra times by the get_rel_sync_entry function.

I think it depends but why in the first place do you want to allow
re-evaluation when there is a way for not doing that?
Because the current code logic of having the "delayed" ExprState
evaluation does come at some cost. And the cost is -
a. Needing an extra condition and more code in the function pgoutput_row_filter
b. Needing to maintain the additional Node list
If we chose not to implement a delayed ExprState cache evaluation then
there would still be a (one-time) ExprState cache evaluation but it
would happen whenever get_rel_sync_entry is called (regardless of whether
pgoutput_row_filter is subsequently called). E.g. there can be some
rebuilds of the ExprState cache if the user calls TRUNCATE.
I guess I felt the only justification for implementing more
sophisticated cache logic is if gives a performance gain. But if there
is no observable difference, then maybe it's better to just keep the
code simpler. That is why I have been questioning how much time a
one-time ExprState cache evaluation really takes, and would a few
extra ones even be noticeable.
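The trade-off under discussion (build the compiled filter eagerly when the relation entry is looked up, or lazily on the first filtered row) can be sketched with a toy Python model. This is purely illustrative, not the pgoutput C code; the class and attribute names are hypothetical stand-ins for RelationSyncEntry's rfnode list and cached ExprState.

```python
class RelSyncEntry:
    """Toy model of a per-relation cache entry with a lazily built filter."""

    def __init__(self, filter_source):
        self.filter_source = filter_source  # analogous to the cached rfnode list
        self.compiled = None                # analogous to the cached ExprState

    def row_filter(self, row):
        # Lazy variant: compile only when a row actually needs filtering,
        # so entry lookups that never publish a row pay nothing.
        if self.compiled is None:
            self.compiled = compile(self.filter_source, "<filter>", "eval")
        return bool(eval(self.compiled, {}, dict(row)))

entry = RelSyncEntry("id > 2000 and id <= 3000")
assert entry.compiled is None             # nothing built at lookup time
assert entry.row_filter({"id": 2500})     # first row triggers the build
assert entry.compiled is not None
assert not entry.row_filter({"id": 100})  # later rows reuse the cache
```

The eager variant would move the compile call into `__init__`; the cost difference only matters when many lookups never reach `row_filter`, which is exactly the scenario being debated above.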
------
Kind Regards,
Peter Smith.
Fujitsu Australia.
On Thu, Aug 26, 2021 at 3:41 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Thu, Aug 26, 2021 at 3:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 26, 2021 at 9:51 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Thu, Aug 26, 2021 at 1:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 26, 2021 at 7:37 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Wed, Aug 25, 2021 at 3:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
...
Hmm, I think the gain via caching is not visible because we are using
simple expressions. It will be visible when we use somewhat complex
expressions where expression evaluation cost is significant.
Similarly, the impact of this change will magnify and it will also be
visible when a publication has many tables. Apart from performance,
this change is logically correct as well because it would be any way
better if we don't invalidate the cached expressions unless required.

Please tell me what is your idea of a "complex" row filter expression.
Do you just mean a filter that has multiple AND conditions in it? I
don't really know if few complex expressions would amount to any
significant evaluation costs, so I would like to run some timing tests
with some real examples to see the results.

I think this means you didn't even understand or are convinced why the
patch has cache in the first place. As per your theory, even if we
didn't have cache, it won't matter but that is not true otherwise, the
patch wouldn't have it.

I have never said there should be no caching. On the contrary, my
performance test results [1] already confirmed that caching ExprState
is of benefit for the millions of times it may be used in the
pgoutput_row_filter function. My only doubts are in regard to how much
observable impact there would be re-evaluating the filter expression
just a few extra times by the get_rel_sync_entry function.

I think it depends but why in the first place do you want to allow
re-evaluation when there is a way for not doing that?

Because the current code logic of having the "delayed" ExprState
evaluation does come at some cost.
So, now you mixed it with the second point. Here, I was talking about
the need for correct invalidation but you started discussing when to
first time evaluate the expression, both are different things.
And the cost is -
a. Needing an extra condition and more code in the function pgoutput_row_filter
b. Needing to maintain the additional Node list
I am not sure you need (b) above and I think (a) should make the
overall code look clean.
If we chose not to implement a delayed ExprState cache evaluation then
there would still be a (one-time) ExprState cache evaluation but it
would happen whenever get_rel_sync_entry is called (regardless of whether
pgoutput_row_filter is subsequently called). E.g. there can be some
rebuilds of the ExprState cache if the user calls TRUNCATE.
Apart from Truncate, it will also be a waste if any error happens
before actually evaluating the filter, tomorrow there could be other
operations like replication of sequences (I have checked that proposed
patch for sequences uses get_rel_sync_entry) where we don't need to
build ExprState (as filters might or might not be there). So, it would
be better to avoid cache lookups in those cases if possible. I still
think doing expensive things like preparing expressions should ideally
be done only when it is required.
--
With Regards,
Amit Kapila.
On Thu, Aug 26, 2021 at 9:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 26, 2021 at 3:41 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Thu, Aug 26, 2021 at 3:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 26, 2021 at 9:51 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Thu, Aug 26, 2021 at 1:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 26, 2021 at 7:37 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Wed, Aug 25, 2021 at 3:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
...
Hmm, I think the gain via caching is not visible because we are using
simple expressions. It will be visible when we use somewhat complex
expressions where expression evaluation cost is significant.
Similarly, the impact of this change will magnify and it will also be
visible when a publication has many tables. Apart from performance,
this change is logically correct as well because it would be any way
better if we don't invalidate the cached expressions unless required.

Please tell me what is your idea of a "complex" row filter expression.
Do you just mean a filter that has multiple AND conditions in it? I
don't really know if few complex expressions would amount to any
significant evaluation costs, so I would like to run some timing tests
with some real examples to see the results.

I think this means you didn't even understand or are convinced why the
patch has cache in the first place. As per your theory, even if we
didn't have cache, it won't matter but that is not true otherwise, the
patch wouldn't have it.

I have never said there should be no caching. On the contrary, my
performance test results [1] already confirmed that caching ExprState
is of benefit for the millions of times it may be used in the
pgoutput_row_filter function. My only doubts are in regard to how much
observable impact there would be re-evaluating the filter expression
just a few extra times by the get_rel_sync_entry function.

I think it depends but why in the first place do you want to allow
re-evaluation when there is a way for not doing that?

Because the current code logic of having the "delayed" ExprState
evaluation does come at some cost.

So, now you mixed it with the second point. Here, I was talking about
the need for correct invalidation but you started discussing when to
first time evaluate the expression, both are different things.

And the cost is -
a. Needing an extra condition and more code in the function pgoutput_row_filter
b. Needing to maintain the additional Node list

I am not sure you need (b) above and I think (a) should make the
overall code look clean.

If we chose not to implement a delayed ExprState cache evaluation then
there would still be a (one-time) ExprState cache evaluation but it
would happen whenever get_rel_sync_entry is called (regardless of whether
pgoutput_row_filter is subsequently called). E.g. there can be some
rebuilds of the ExprState cache if the user calls TRUNCATE.

Apart from Truncate, it will also be a waste if any error happens
before actually evaluating the filter, tomorrow there could be other
operations like replication of sequences (I have checked that proposed
patch for sequences uses get_rel_sync_entry) where we don't need to
build ExprState (as filters might or might not be there). So, it would
be better to avoid cache lookups in those cases if possible. I still
think doing expensive things like preparing expressions should ideally
be done only when it is required.
OK. Per your suggestion, I will try to move as much of the row-filter
cache code as possible out of the get_rel_sync_entry function and into
the pgoutput_row_filter function.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Fri, Aug 27, 2021 at 3:31 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Thu, Aug 26, 2021 at 9:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 26, 2021 at 3:41 PM Peter Smith <smithpb2250@gmail.com> wrote:
Apart from Truncate, it will also be a waste if any error happens
before actually evaluating the filter, tomorrow there could be other
operations like replication of sequences (I have checked that proposed
patch for sequences uses get_rel_sync_entry) where we don't need to
build ExprState (as filters might or might not be there). So, it would
be better to avoid cache lookups in those cases if possible. I still
think doing expensive things like preparing expressions should ideally
be done only when it is required.

OK. Per your suggestion, I will try to move as much of the row-filter
cache code as possible out of the get_rel_sync_entry function and into
the pgoutput_row_filter function.
I could think of more scenarios where doing this work in
get_rel_sync_entry() could cost us without any actual need for it.
Consider, the user has published only 'update' and 'delete' operation
for a publication, then in the system there are inserts followed
truncate or any ddl which generates invalidation, for such a case, for
each change we need to rebuild the row_filters but we won't use it.
Similarly, this can happen in any other combination of DML and DDL
operations where the DML operation is not published. I don't want to
say that this is the most common scenario but it is important to do
expensive work when it is actually required, otherwise, there could be
cases where it might hit us.
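Amit's scenario above (invalidations arriving for DML operations that are never published, forcing repeated filter rebuilds under the eager approach) can be made concrete with a small model. Hypothetical names, illustrative only; the real state lives in RelationSyncEntry in the C code.

```python
class FilterCache:
    """Toy model of lazy rebuild after cache invalidation."""

    def __init__(self, source):
        self.source = source
        self.compiled = None
        self.builds = 0

    def invalidate(self):
        # What the relcache callback effectively does: drop the cached state.
        self.compiled = None

    def evaluate(self, row):
        # Rebuild only when a published row actually needs the filter.
        if self.compiled is None:
            self.compiled = compile(self.source, "<f>", "eval")
            self.builds += 1
        return bool(eval(self.compiled, {}, dict(row)))

cache = FilterCache("active")
# Repeated invalidations with no published DML in between: zero rebuilds,
# whereas an eager rebuild on every entry lookup would pay each time.
for _ in range(5):
    cache.invalidate()
assert cache.builds == 0
assert cache.evaluate({"active": True})  # compiled only on first real use
assert cache.builds == 1
```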
--
With Regards,
Amit Kapila.
On Fri, Aug 27, 2021 at 8:01 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Thu, Aug 26, 2021 at 9:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 26, 2021 at 3:41 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Thu, Aug 26, 2021 at 3:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 26, 2021 at 9:51 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Thu, Aug 26, 2021 at 1:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 26, 2021 at 7:37 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Wed, Aug 25, 2021 at 3:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
...
Hmm, I think the gain via caching is not visible because we are using
simple expressions. It will be visible when we use somewhat complex
expressions where expression evaluation cost is significant.
Similarly, the impact of this change will magnify and it will also be
visible when a publication has many tables. Apart from performance,
this change is logically correct as well because it would be any way
better if we don't invalidate the cached expressions unless required.

Please tell me what is your idea of a "complex" row filter expression.
Do you just mean a filter that has multiple AND conditions in it? I
don't really know if few complex expressions would amount to any
significant evaluation costs, so I would like to run some timing tests
with some real examples to see the results.

I think this means you didn't even understand or are convinced why the
patch has cache in the first place. As per your theory, even if we
didn't have cache, it won't matter but that is not true otherwise, the
patch wouldn't have it.

I have never said there should be no caching. On the contrary, my
performance test results [1] already confirmed that caching ExprState
is of benefit for the millions of times it may be used in the
pgoutput_row_filter function. My only doubts are in regard to how much
observable impact there would be re-evaluating the filter expression
just a few extra times by the get_rel_sync_entry function.

I think it depends but why in the first place do you want to allow
re-evaluation when there is a way for not doing that?

Because the current code logic of having the "delayed" ExprState
evaluation does come at some cost.

So, now you mixed it with the second point. Here, I was talking about
the need for correct invalidation but you started discussing when to
first time evaluate the expression, both are different things.

And the cost is -
a. Needing an extra condition and more code in the function pgoutput_row_filter
b. Needing to maintain the additional Node list

I am not sure you need (b) above and I think (a) should make the
overall code look clean.

If we chose not to implement a delayed ExprState cache evaluation then
there would still be a (one-time) ExprState cache evaluation but it
would happen whenever get_rel_sync_entry is called (regardless of whether
pgoutput_row_filter is subsequently called). E.g. there can be some
rebuilds of the ExprState cache if the user calls TRUNCATE.

Apart from Truncate, it will also be a waste if any error happens
before actually evaluating the filter, tomorrow there could be other
operations like replication of sequences (I have checked that proposed
patch for sequences uses get_rel_sync_entry) where we don't need to
build ExprState (as filters might or might not be there). So, it would
be better to avoid cache lookups in those cases if possible. I still
think doing expensive things like preparing expressions should ideally
be done only when it is required.

OK. Per your suggestion, I will try to move as much of the row-filter
cache code as possible out of the get_rel_sync_entry function and into
the pgoutput_row_filter function.
Here are the new v26* patches. This is a refactoring of the row-filter
caches to remove all the logic from the get_rel_sync_entry function
and delay it until if/when needed in the pgoutput_row_filter function.
This is now implemented per Amit's suggestion to move all the cache
code [1]. It is a replacement for the v25* patches.
The make check and TAP subscription tests are all OK. I have repeated
the performance tests [2] and those results are good too.
v26-0001 <--- v23 (base RF patch)
v26-0002 <--- ExprState cache mods (refactored row filter caching)
v26-0003 <--- ExprState cache extra debug logging (temp)
------
[1]: /messages/by-id/CAA4eK1+tio46goUKBUfAKFsFVxtgk8nOty=TxKoKH-gdLzHD2g@mail.gmail.com
[2]: /messages/by-id/CAHut+Ps5j7mkO0xLmNW=kXh0eezGoKyzBCiQc9bfkCiM_MVDrg@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia.
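The attached patch's commit message states that when a subscription sees a table through several publications with different WHERE clauses, a row must satisfy all of the expressions to be copied, and that a filter evaluating to NULL counts as false. A tiny Python model of those two rules (illustrative only; filters are plain callables here, not parsed SQL expressions):

```python
def row_passes(filters, row):
    """Row is replicated only if every publication's filter passes;
    a NULL (modeled as None) result counts as false, per the commit
    message of the attached v26 patch. Toy model, not server code."""
    for f in filters:
        result = f(row)
        if result is None or not result:
            return False
    return True

# Two publications filtering the same table with different WHERE clauses.
filters = [lambda r: r["id"] > 2000, lambda r: r["active"]]
assert row_passes(filters, {"id": 2500, "active": True})
assert not row_passes(filters, {"id": 100, "active": True})   # fails one filter
assert not row_passes(filters, {"id": 2500, "active": None})  # NULL => false
```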
Attachments:
v26-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v26-0001-Row-filter-for-logical-replication.patchDownload
From 8ff1966fef8fc111b81f21ee186ff0ca982ce9bd Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 27 Aug 2021 10:11:36 +1000
Subject: [PATCH v26] Row filter for logical replication
This feature adds row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If the subscriber is a pre-15 version, data
synchronization won't use row filters if they are defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 47 ++++-
src/backend/commands/publicationcmds.c | 112 +++++++----
src/backend/nodes/copyfuncs.c | 14 ++
src/backend/nodes/equalfuncs.c | 12 ++
src/backend/parser/gram.y | 24 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 13 ++
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 255 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 71 +++++++
src/test/regress/sql/publication.sql | 32 +++
src/test/subscription/t/024_row_filter.pl | 298 ++++++++++++++++++++++++++++
26 files changed, 1048 insertions(+), 75 deletions(-)
create mode 100644 src/test/subscription/t/024_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2b2c70a..d473af1 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6231,6 +6231,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b..4bb4314 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbc..35006d9 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -183,6 +187,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or that are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will not
+ be replicated. That's because the old row is used and it only contains the
+ primary key or columns that are part of the <literal>REPLICA IDENTITY</literal>;
+ the remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ and <command>UPDATE</command> operations, any column might be used in the
+ <literal>WHERE</literal> clause, because the new row is used and it contains
+ all columns. A <literal>NULL</literal> value causes the expression to evaluate
+ to false; avoid using columns without not-null constraints in the
+ <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
+ not allow functions or user-defined operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -197,6 +216,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -210,6 +234,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 702934e..94e3981 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 2a2fe03..6057fc3 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,21 +144,27 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -177,6 +186,26 @@ publication_add_relation(Oid pubid, Relation targetrel,
check_publication_add_relation(targetrel);
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
+
/* Form a tuple. */
memset(values, 0, sizeof(values));
memset(nulls, false, sizeof(nulls));
@@ -189,6 +218,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -205,6 +240,14 @@ publication_add_relation(Oid pubid, Relation targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 8487eeb..4709a71 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -384,31 +384,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all existing publication-table mappings. In principle we
+ * could remove only (i) tables that are not found in the new table
+ * list and (ii) tables that are being re-added with a different qual
+ * expression, but for (ii) simply updating the existing tuple is not
+ * enough: the dependencies of the old qual expression would have to
+ * be dropped and recreated as well.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- Relation newrel = (Relation) lfirst(newlc);
-
- if (RelationGetRelid(newrel) == oldrelid)
- {
- found = true;
- break;
- }
- }
-
- if (!found)
- {
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
-
- delrels = lappend(delrels, oldrel);
- }
+ PublicationRelationInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelationInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -498,7 +491,8 @@ RemovePublicationRelById(Oid proid)
}
/*
- * Open relations specified by a RangeVar list.
+ * Open relations specified by a list of PublicationTable or RangeVar nodes.
+ *
* The returned tables are locked in ShareUpdateExclusiveLock mode in order to
* add them to a publication.
*/
@@ -508,16 +502,41 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationInfo *pri;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = lfirst_node(RangeVar, lc);
- bool recurse = rv->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a list of PublicationTable
+ * nodes (relation and optional WHERE clause), whereas ALTER PUBLICATION
+ * ... DROP TABLE provides a plain list of RangeVar nodes. Check which
+ * kind of element this List contains.
+ */
+ if (IsA(lfirst(lc), PublicationTable))
+ whereclause = true;
+ else
+ whereclause = false;
+
+ if (whereclause)
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+ else
+ {
+ rv = lfirst_node(RangeVar, lc);
+ }
+
+ recurse = rv->inh;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
@@ -537,8 +556,14 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = myrelid;
+ pri->relation = rel;
+ if (whereclause)
+ pri->whereClause = t->whereClause;
+ else
+ pri->whereClause = NULL;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, myrelid);
/*
@@ -571,7 +596,15 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = childrelid;
+ pri->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (whereclause)
+ pri->whereClause = t->whereClause;
+ else
+ pri->whereClause = NULL;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, childrelid);
}
}
@@ -592,10 +625,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(pri->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -611,15 +646,15 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(pri->relid, GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(pri->relation->rd_rel->relkind),
+ RelationGetRelationName(pri->relation));
- obj = publication_add_relation(pubid, rel, if_not_exists);
+ obj = publication_add_relation(pubid, pri, if_not_exists);
if (stmt)
{
EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
@@ -643,11 +678,10 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pri->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -657,7 +691,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pri->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 38251c2..291c01b 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4828,6 +4828,17 @@ _copyAlterPublicationStmt(const AlterPublicationStmt *from)
return newnode;
}
+static PublicationTable *
+_copyPublicationTable(const PublicationTable *from)
+{
+ PublicationTable *newnode = makeNode(PublicationTable);
+
+ COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
+
+ return newnode;
+}
+
static CreateSubscriptionStmt *
_copyCreateSubscriptionStmt(const CreateSubscriptionStmt *from)
{
@@ -5692,6 +5703,9 @@ copyObjectImpl(const void *from)
case T_AlterPublicationStmt:
retval = _copyAlterPublicationStmt(from);
break;
+ case T_PublicationTable:
+ retval = _copyPublicationTable(from);
+ break;
case T_CreateSubscriptionStmt:
retval = _copyCreateSubscriptionStmt(from);
break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 8a17620..3ec66c4 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2315,6 +2315,15 @@ _equalAlterPublicationStmt(const AlterPublicationStmt *a,
}
static bool
+_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
+{
+ COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
+
+ return true;
+}
+
+static bool
_equalCreateSubscriptionStmt(const CreateSubscriptionStmt *a,
const CreateSubscriptionStmt *b)
{
@@ -3700,6 +3709,9 @@ equal(const void *a, const void *b)
case T_AlterPublicationStmt:
retval = _equalAlterPublicationStmt(a, b);
break;
+ case T_PublicationTable:
+ retval = _equalPublicationTable(a, b);
+ break;
case T_CreateSubscriptionStmt:
retval = _equalCreateSubscriptionStmt(a, b);
break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 39a2849..96c42d8 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -426,14 +426,14 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
vacuum_relation_list opt_vacuum_relation_list
- drop_option_list
+ drop_option_list publication_table_list
%type <node> opt_routine_body
%type <groupclause> group_clause
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -9620,7 +9620,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9651,7 +9651,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9659,7 +9659,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9677,6 +9677,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f928c32..fc4170e 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -119,6 +119,13 @@ transformExprRecurse(ParseState *pstate, Node *expr)
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
+ /* Functions are not allowed in publication WHERE clauses */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && nodeTag(expr) == T_FuncCall)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
switch (nodeTag(expr))
{
case T_ColumnRef:
@@ -509,6 +516,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1769,6 +1777,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3089,6 +3100,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..e946f17 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Disallow user-defined operators in publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..48bdbc3 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get the relation quals. DISTINCT prevents the same expression from
+ * being included multiple times in the final expression when the table
+ * appears in more than one publication with the same filter.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfo(&cmd, "%s", q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 14d737f..ef1ba91 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +148,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +157,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +639,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache ExprState using CacheMemoryContext. This is the same code as
+ * ExecPrepareExpr() but that is not used because it doesn't use an EState.
+ * It should probably be another function in the executor to handle the
+ * execution outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluate the row filter expression.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the
+ * change is not replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change should be replicated, false otherwise.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +808,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +832,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +839,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * The schema must be sent before the relation is possibly
+ * switched to its ancestor below, since maybe_send_schema()
+ * also sends the ancestor's schema.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +872,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +906,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +975,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1113,9 +1294,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1138,6 +1320,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1149,6 +1333,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1162,6 +1347,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for the row filter. The TupleDesc must
+ * live as long as the cache entry does.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1171,6 +1372,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1230,9 +1434,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1350,6 +1578,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1359,6 +1588,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1376,7 +1607,13 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 90ac445..f4f1298 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4148,6 +4148,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4158,9 +4159,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4169,6 +4177,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4209,6 +4218,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4241,8 +4254,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index f5e170e..0c31005 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -627,6 +627,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 8333558..97250d2 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBuffer(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index f332bad..2703b9c 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -83,6 +83,13 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationInfo
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationInfo;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -108,7 +115,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 6a4d82f..56d13ff 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -490,6 +490,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 7af13de..875b809 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3625,12 +3625,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3643,7 +3650,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 1500de2..4537543 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 4a5ef0b..319c6bc 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -158,6 +158,77 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: cannot add relation "testpub_view" to publication
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075..b1606cc 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,38 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/024_row_filter.pl b/src/test/subscription/t/024_row_filter.pl
new file mode 100644
index 0000000..ca8153e
--- /dev/null
+++ b/src/test/subscription/t/024_row_filter.pl
@@ -0,0 +1,298 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, every filter
+# expression must succeed. In this case, a row is replicated only if c is
+# divisible by both 2 AND 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because with b = NULL the condition b <> 'filtered' is NULL, not true
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
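The tab_rowfilter_2 expectations above rely on the AND semantics for a table published by multiple publications: a row is sent only if it passes the filter of every publication the subscription uses. A tiny model of that rule (plain Python, not PostgreSQL code; the filter lambdas mirror the tap_pub_1 and tap_pub_2 expressions from the test):

```python
# Hypothetical model of the multiple-publication row-filter semantics:
# a row is replicated only when it satisfies ALL publication filters.
def replicated(row, filters):
    """Return True when the row passes every publication's filter."""
    return all(f(row) for f in filters)

filters = [
    lambda c: c % 2 == 0,  # tap_pub_1 filter: (c % 2 = 0)
    lambda c: c % 3 == 0,  # tap_pub_2 filter: (c % 3 = 0)
]

# The test inserts c = 1..20 into tab_rowfilter_2.
rows = [c for c in range(1, 21) if replicated(c, filters)]
print(rows)  # [6, 12, 18]
```

This reproduces the `count(c), min(c), max(c)` result `3|6|18` checked by the TAP test.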
v26-0003-ExprState-cache-temp-extra-logging.patch (application/octet-stream)
From cf26b61bb26e994cac16f9b2e033d9a2a12cae53 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Mon, 30 Aug 2021 11:59:00 +1000
Subject: [PATCH v26] ExprState cache temp extra logging.
This patch just injects some extra logging into the code which is helpful
to see what is going on when debugging the row filter caching.
Temporary. Not to be committed.
---
src/backend/replication/pgoutput/pgoutput.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index ce5e1c5..d89dcb0 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -759,6 +759,8 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
MemoryContext oldctx;
TupleDesc tupdesc = RelationGetDescr(relation);
+ elog(LOG, "!!> pgoutput_row_filter: Look for row filters of relid=%d", relid);
+
/*
* Create a tuple table slot for row filter. TupleDesc must live as
* long as the cache remains. Release the tuple table slot if it
@@ -832,11 +834,13 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
Node *rfnode;
ExprState *exprstate;
+ elog(LOG, "!!> pgoutput_row_filter: (re)build the ExprState cache for relid=%d", pub_relid);
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
rfnode = stringToNode(TextDatumGetCString(rfdatum));
exprstate = pgoutput_row_filter_init_expr(rfnode);
entry->exprstate_list = lappend(entry->exprstate_list, exprstate);
MemoryContextSwitchTo(oldctx);
+ elog(LOG, "!!> pgoutput_row_filter: build the ExprState on-the-fly");
}
ReleaseSysCache(rftuple);
@@ -874,6 +878,8 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
{
ExprState *exprstate = (ExprState *) lfirst(lc);
+ elog(LOG, "!!> pgoutput_row_filter: using cached ExprState for relid=%d", relid);
+
/* Evaluates row filter */
result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
@@ -1340,6 +1346,8 @@ init_rel_sync_cache(MemoryContext cachectx)
{
HASHCTL ctl;
+ elog(LOG, "!!> HELLO init_rel_sync_cache");
+
if (RelationSyncCache != NULL)
return;
@@ -1414,6 +1422,8 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
Assert(RelationSyncCache != NULL);
+ elog(LOG, "!!> HELLO get_rel_sync_entry for relid=%d", relid);
+
/* Find cached relation info, creating if not found */
entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
(void *) &relid,
@@ -1553,6 +1563,8 @@ cleanup_rel_sync_cache(TransactionId xid, bool is_commit)
RelationSyncEntry *entry;
ListCell *lc;
+ elog(LOG, "!!> HELLO cleanup_rel_sync_cache");
+
Assert(RelationSyncCache != NULL);
hash_seq_init(&hash_seq, RelationSyncCache);
@@ -1587,6 +1599,8 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
{
RelationSyncEntry *entry;
+ elog(LOG, "!!> HELLO rel_sync_cache_relation_cb relid=%d", relid);
+
/*
* We can get here if the plugin was used in SQL interface as the
* RelSchemaSyncCache is destroyed when the decoding finishes, but there
@@ -1616,6 +1630,8 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
*/
if (entry != NULL)
{
+ elog(LOG, "!!> rel_sync_cache_relation_cb: exprstate_list %s", entry->exprstate_list == NIL ? "NIL" : "not NIL");
+
entry->schema_sent = false;
list_free(entry->streamed_txns);
entry->streamed_txns = NIL;
@@ -1642,6 +1658,7 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
}
if (entry->exprstate_list != NIL)
{
+ elog(LOG, "!!> rel_sync_cache_relation_cb: cleanup ExprState cache for relid=%d", relid);
list_free_deep(entry->exprstate_list);
entry->exprstate_list = NIL;
}
@@ -1658,6 +1675,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
RelationSyncEntry *entry;
MemoryContext oldctx;
+ elog(LOG, "!!> HELLO rel_sync_cache_publication_cb");
+
/*
* We can get here if the plugin was used in SQL interface as the
* RelSchemaSyncCache is destroyed when the decoding finishes, but there
--
1.8.3.1
v26-0002-ExprState-cache-modifications.patch (application/octet-stream)
From f83101836af208771b712d0e485d3d081579c97e Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Mon, 30 Aug 2021 11:26:26 +1000
Subject: [PATCH v26] ExprState cache modifications.
The cached row-filter state (e.g. the ExprState list) is now invalidated only
in the rel_sync_cache_relation_cb function, so an ALTER PUBLICATION for one
table no longer causes the row filters of other tables to be invalidated as
well.
Also, all code related to caching row filters has been removed from the
get_rel_sync_entry function; the caching is now done just before the filters
are needed, in the pgoutput_row_filter function.
Changes are based on suggestions from Amit [1][2].
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
---
src/backend/replication/pgoutput/pgoutput.c | 198 +++++++++++++++++++---------
1 file changed, 136 insertions(+), 62 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index ef1ba91..ce5e1c5 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -123,7 +123,15 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
- List *exprstate; /* ExprState for row filter */
+
+ /*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' only means that exprstate_list is correct -
+ * It doesn't mean that there actual is any row filter present for the
+ * current relid.
+ */
+ bool rowfilter_valid;
+ List *exprstate_list; /* ExprState for row filter(s) */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
@@ -161,7 +169,7 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static EState *create_estate_for_relation(Relation rel);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
/*
@@ -731,20 +739,121 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
* If it returns true, the change is replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
{
EState *estate;
ExprContext *ecxt;
ListCell *lc;
bool result = true;
+ Oid relid = RelationGetRelid(relation);
+
+ /*
+ * If the row filter caching is currently flagged "invalid" then it means we
+ * don't know yet whether there are any row filters for this relation.
+ *
+ * This code is usually one-time execution.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ bool am_partition = get_rel_relispartition(relid);
+ MemoryContext oldctx;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains. Release the tuple table slot if it
+ * already exists.
+ */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState(s) and cache them in the
+ * entry->exprstate_list.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+ Oid pub_relid = relid;
+
+ if (pub->pubviaroot && am_partition)
+ {
+ if (pub->alltables)
+ pub_relid = llast_oid(get_partition_ancestors(relid));
+ else
+ {
+ List *ancestors = get_partition_ancestors(relid);
+ ListCell *lc2;
+
+ /*
+ * Find the "topmost" ancestor that is in this
+ * publication.
+ */
+ foreach(lc2, ancestors)
+ {
+ Oid ancestor = lfirst_oid(lc2);
+
+ if (list_member_oid(GetRelationPublications(ancestor),
+ pub->oid))
+ {
+ pub_relid = ancestor;
+ }
+ }
+ }
+ }
+
+ /*
+ * Look up whether there is a row filter, and if so, build the ExprState for it.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(pub_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate_list = lappend(entry->exprstate_list, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+
+ } /* loop all subscribed publications */
+
+ entry->rowfilter_valid = true;
+ }
/* Bail out if there is no row filter */
- if (entry->exprstate == NIL)
+ if (entry->exprstate_list == NIL)
return true;
elog(DEBUG3, "table \"%s.%s\" has row filter",
- get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
- get_rel_name(relation->rd_id));
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
PushActiveSnapshot(GetTransactionSnapshot());
@@ -761,7 +870,7 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
* different row filter in these publications, all row filters must be
* matched in order to replicate this change.
*/
- foreach(lc, entry->exprstate)
+ foreach(lc, entry->exprstate_list)
{
ExprState *exprstate = (ExprState *) lfirst(lc);
@@ -840,7 +949,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
break;
/*
@@ -873,7 +982,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -907,7 +1016,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1318,10 +1427,11 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
- entry->exprstate = NIL;
+ entry->exprstate_list = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1333,7 +1443,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
- TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1347,22 +1456,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
publications_valid = true;
}
- /* Release tuple table slot */
- if (entry->scantuple != NULL)
- {
- ExecDropSingleTupleTableSlot(entry->scantuple);
- entry->scantuple = NULL;
- }
-
- /*
- * Create a tuple table slot for row filter. TupleDesc must live as
- * long as the cache remains.
- */
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- tupdesc = CreateTupleDescCopy(tupdesc);
- entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
- MemoryContextSwitchTo(oldctx);
-
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1372,9 +1465,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
{
Publication *pub = lfirst(lc);
bool publish = false;
- HeapTuple rftuple;
- Datum rfdatum;
- bool rfisnull;
if (pub->alltables)
{
@@ -1434,33 +1524,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- /*
- * Cache row filter, if available. All publication-table mappings
- * must be checked. If it is a partition and pubviaroot is true,
- * use the row filter of the topmost partitioned table instead of
- * the row filter of its own partition.
- */
- rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
- if (HeapTupleIsValid(rftuple))
- {
- rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
-
- if (!rfisnull)
- {
- Node *rfnode;
- ExprState *exprstate;
-
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = stringToNode(TextDatumGetCString(rfdatum));
-
- /* Prepare for expression execution */
- exprstate = pgoutput_row_filter_init_expr(rfnode);
- entry->exprstate = lappend(entry->exprstate, exprstate);
- MemoryContextSwitchTo(oldctx);
- }
-
- ReleaseSysCache(rftuple);
- }
}
list_free(pubids);
@@ -1567,6 +1630,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanups. (Will be rebuilt later if needed).
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstate_list != NIL)
+ {
+ list_free_deep(entry->exprstate_list);
+ entry->exprstate_list = NIL;
+ }
}
}
@@ -1607,10 +1685,6 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
-
- if (entry->exprstate != NIL)
- list_free_deep(entry->exprstate);
- entry->exprstate = NIL;
}
MemoryContextSwitchTo(oldctx);
--
1.8.3.1
On Sun, Aug 29, 2021, at 11:14 PM, Peter Smith wrote:
Here are the new v26* patches. This is a refactoring of the row-filter
caches to remove all the logic from the get_rel_sync_entry function
and delay it until if/when needed in the pgoutput_row_filter function.
This is now implemented per Amit's suggestion to move all the cache
code [1]. It is a replacement for the v25* patches.The make check and TAP subscription tests are all OK. I have repeated
the performance tests [2] and those results are good too.v26-0001 <--- v23 (base RF patch)
v26-0002 <--- ExprState cache mods (refactored row filter caching)
v26-0002 <--- ExprState cache extra debug logging (temp)
Peter, I'm still reviewing this new cache mechanism. I will provide feedback
as soon as I integrate it into this recent modification.
I'm attaching a new version that simply includes Houzj's review [1]. This is
based on v23.
There has been a discussion about which row should be used by the row filter. We
don't have a unanimous choice, so I think it is prudent to provide a way for
the user to change it. I suggested in a previous email [2] that a publication
option should be added. Hence, the row filter can be applied to the old tuple,
the new tuple, or both. This approach is simpler than using OLD/NEW references
(less code, and it avoids validation such as a NEW reference for DELETEs and an
OLD reference for INSERTs). I thought about a reasonable default value, and the
_new_ tuple seems a good one because (i) it is always available and (ii) the
user doesn't have to figure out that replication is broken due to a column that
is not part of the replica identity. I'm attaching a POC that implements it.
I'm still polishing it. Adding tests for multiple row filters and integrating
Peter's caching mechanism [3] are the next steps.
[1]: /messages/by-id/OS0PR01MB571696CA853B3655F7DE752994E29@OS0PR01MB5716.jpnprd01.prod.outlook.com
[2]: /messages/by-id/5a3f74df-ffa1-4126-a5d8-dbb081d3e439@www.fastmail.com
[3]: /messages/by-id/CAHut+PsgRHymwLhJ9t3By6+KNaVDzfjf6Y4Aq=JRD-y8t1mEFg@mail.gmail.com
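To make the old/new tuple discussion concrete, here is a minimal sketch based on
the documentation examples in the attached patch (table and column names are
illustrative; the publication option for choosing old/new/both is a proposal and
not shown because its name is not final):

```sql
-- A row filter on a column that is NOT part of the replica identity.
CREATE TABLE departments (id int PRIMARY KEY, active boolean);
CREATE PUBLICATION active_departments FOR TABLE departments
    WHERE (active IS TRUE);

-- INSERT and UPDATE are evaluated against the new tuple, so "active" is
-- always available and the filter behaves as expected.
INSERT INTO departments VALUES (10, true);

-- A DELETE only carries the old tuple, which is limited to the replica
-- identity (here: id).  "active" is therefore NULL, the filter evaluates
-- to false, and the DELETE is not replicated to the subscriber.
DELETE FROM departments WHERE id = 10;
```

With a per-publication option, the user could instead ask for the filter to be
evaluated against the old tuple (or both), trading this surprise for a
restriction to replica-identity columns.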
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
v27-0001-Row-filter-for-logical-replication.patch (text/x-patch)
From 018cdb79733ddf4f0de1e4eace3a172bd685d53c Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 18 Jan 2021 12:07:51 -0300
Subject: [PATCH v27 1/2] Row filter for logical replication
This feature adds a row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If the subscriber is a pre-15 version, data
synchronization won't use row filters if they are defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 47 ++-
src/backend/commands/publicationcmds.c | 101 ++++---
src/backend/nodes/copyfuncs.c | 14 +
src/backend/nodes/equalfuncs.c | 12 +
src/backend/parser/gram.y | 24 +-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++-
src/backend/replication/pgoutput/pgoutput.c | 257 ++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 9 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 71 +++++
src/test/regress/sql/publication.sql | 32 +++
src/test/subscription/t/025_row_filter.pl | 300 ++++++++++++++++++++
26 files changed, 1048 insertions(+), 76 deletions(-)
create mode 100644 src/test/subscription/t/025_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2b2c70a26e..d473af1b7b 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6231,6 +6231,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b2c6..4bb4314458 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbca55..8f78fbbd90 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -182,6 +186,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
disallowed on those tables.
</para>
+ <para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will not
+ be replicated. That's because the old row is used, and it only contains the
+ primary key or the columns that are part of the <literal>REPLICA
+ IDENTITY</literal>; the remaining columns are <literal>NULL</literal>. For
+ <command>INSERT</command> and <command>UPDATE</command> operations, any
+ column might be used in the <literal>WHERE</literal> clause, because the new
+ row is used and it contains all columns. A <literal>NULL</literal> value
+ causes the expression to evaluate to false; avoid using columns without
+ not-null constraints in the <literal>WHERE</literal> clause. The
+ <literal>WHERE</literal> clause does not allow functions or user-defined operators.
+ </para>
+
<para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
@@ -197,6 +216,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -209,6 +233,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</programlisting>
</para>
+ <para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
<para>
Create a publication that publishes all changes in all tables:
<programlisting>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 702934eba1..94e398133f 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 2a2fe03c13..6057fc3220 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,21 +144,27 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, Relation targetrel,
+publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -177,6 +186,26 @@ publication_add_relation(Oid pubid, Relation targetrel,
check_publication_add_relation(targetrel);
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
+
/* Form a tuple. */
memset(values, 0, sizeof(values));
memset(nulls, false, sizeof(nulls));
@@ -189,6 +218,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -205,6 +240,14 @@ publication_add_relation(Oid pubid, Relation targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 8487eeb7e6..0df7ffbe54 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -384,31 +384,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
+ PublicationRelationInfo *oldrel;
- foreach(newlc, rels)
- {
- Relation newrel = (Relation) lfirst(newlc);
-
- if (RelationGetRelid(newrel) == oldrelid)
- {
- found = true;
- break;
- }
- }
-
- if (!found)
- {
- Relation oldrel = table_open(oldrelid,
- ShareUpdateExclusiveLock);
-
- delrels = lappend(delrels, oldrel);
- }
+ oldrel = palloc(sizeof(PublicationRelationInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -498,7 +491,8 @@ RemovePublicationRelById(Oid proid)
}
/*
- * Open relations specified by a RangeVar list.
+ * Open relations specified by a RangeVar list (PublicationTable or Relation).
+ *
* The returned tables are locked in ShareUpdateExclusiveLock mode in order to
* add them to a publication.
*/
@@ -508,16 +502,38 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelationInfo *pri;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- RangeVar *rv = lfirst_node(RangeVar, lc);
- bool recurse = rv->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a PublicationTable List
+ * (Relation, Where clause). ALTER PUBLICATION ... DROP TABLE provides
+ * a Relation List. Check the List element to be used.
+ */
+ whereclause = IsA(lfirst(lc), PublicationTable);
+
+ if (whereclause)
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+ else
+ {
+ rv = lfirst_node(RangeVar, lc);
+ }
+
+ recurse = rv->inh;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
@@ -537,8 +553,11 @@ OpenTableList(List *tables)
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
-
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = myrelid;
+ pri->relation = rel;
+ pri->whereClause = whereclause ? t->whereClause : NULL;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, myrelid);
/*
@@ -571,7 +590,12 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
- rels = lappend(rels, rel);
+ pri = palloc(sizeof(PublicationRelationInfo));
+ pri->relid = childrelid;
+ pri->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pri->whereClause = whereclause ? t->whereClause : NULL;
+ rels = lappend(rels, pri);
relids = lappend_oid(relids, childrelid);
}
}
@@ -592,10 +616,12 @@ CloseTableList(List *rels)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
- table_close(rel, NoLock);
+ table_close(pri->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -611,15 +637,15 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
+ if (!pg_class_ownercheck(pri->relid, GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(pri->relation->rd_rel->relkind),
+ RelationGetRelationName(pri->relation));
- obj = publication_add_relation(pubid, rel, if_not_exists);
+ obj = publication_add_relation(pubid, pri, if_not_exists);
if (stmt)
{
EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
@@ -643,11 +669,10 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
- Relation rel = (Relation) lfirst(lc);
- Oid relid = RelationGetRelid(rel);
+ PublicationRelationInfo *pri = (PublicationRelationInfo *) lfirst(lc);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pri->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -657,7 +682,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pri->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 38251c2b8e..291c01bd94 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4828,6 +4828,17 @@ _copyAlterPublicationStmt(const AlterPublicationStmt *from)
return newnode;
}
+static PublicationTable *
+_copyPublicationTable(const PublicationTable *from)
+{
+ PublicationTable *newnode = makeNode(PublicationTable);
+
+ COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
+
+ return newnode;
+}
+
static CreateSubscriptionStmt *
_copyCreateSubscriptionStmt(const CreateSubscriptionStmt *from)
{
@@ -5692,6 +5703,9 @@ copyObjectImpl(const void *from)
case T_AlterPublicationStmt:
retval = _copyAlterPublicationStmt(from);
break;
+ case T_PublicationTable:
+ retval = _copyPublicationTable(from);
+ break;
case T_CreateSubscriptionStmt:
retval = _copyCreateSubscriptionStmt(from);
break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 8a1762000c..3ec66c48af 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2314,6 +2314,15 @@ _equalAlterPublicationStmt(const AlterPublicationStmt *a,
return true;
}
+static bool
+_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
+{
+ COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
+
+ return true;
+}
+
static bool
_equalCreateSubscriptionStmt(const CreateSubscriptionStmt *a,
const CreateSubscriptionStmt *b)
@@ -3700,6 +3709,9 @@ equal(const void *a, const void *b)
case T_AlterPublicationStmt:
retval = _equalAlterPublicationStmt(a, b);
break;
+ case T_PublicationTable:
+ retval = _equalPublicationTable(a, b);
+ break;
case T_CreateSubscriptionStmt:
retval = _equalCreateSubscriptionStmt(a, b);
break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 39a2849eba..96c42d8aec 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -426,14 +426,14 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
transform_element_list transform_type_list
TriggerTransitions TriggerReferencing
vacuum_relation_list opt_vacuum_relation_list
- drop_option_list
+ drop_option_list publication_table_list
%type <node> opt_routine_body
%type <groupclause> group_clause
%type <list> group_by_list
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
-%type <node> opt_publication_for_tables publication_for_tables
+%type <node> opt_publication_for_tables publication_for_tables publication_table_elem
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -9620,7 +9620,7 @@ opt_publication_for_tables:
;
publication_for_tables:
- FOR TABLE relation_expr_list
+ FOR TABLE publication_table_list
{
$$ = (Node *) $3;
}
@@ -9651,7 +9651,7 @@ AlterPublicationStmt:
n->options = $5;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name ADD_P TABLE relation_expr_list
+ | ALTER PUBLICATION name ADD_P TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9659,7 +9659,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_ADD;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name SET TABLE relation_expr_list
+ | ALTER PUBLICATION name SET TABLE publication_table_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
@@ -9677,6 +9677,20 @@ AlterPublicationStmt:
}
;
+publication_table_list:
+ publication_table_elem { $$ = list_make1($1); }
+ | publication_table_list ',' publication_table_elem { $$ = lappend($1, $3); }
+ ;
+
+publication_table_elem: relation_expr OptWhereClause
+ {
+ PublicationTable *n = makeNode(PublicationTable);
+ n->relation = $1;
+ n->whereClause = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
/*****************************************************************************
*
* CREATE SUBSCRIPTION name ...
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a05a9..193c87d8b7 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f928c32311..321050660e 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -205,8 +205,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -509,6 +520,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1769,6 +1781,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3089,6 +3104,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de7da..e946f17c64 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23afc..29f8835ce1 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a43c..9d86a10594 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get the relation qual. DISTINCT prevents the same expression, when a
+ * table is part of multiple publications, from being included multiple
+ * times in the final expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we must not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
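The WHERE-clause assembly in copy_table() above (AND'ing together every filter fetched for the table) can be sketched the same way; `append_quals` is a hypothetical stand-in for the `foreach` loop over `qual`:

```c
#include <string.h>

/*
 * Append " WHERE q1 AND q2 ..." to buf, mirroring how copy_table() joins
 * the per-publication row filters for the COPY (SELECT ...) command.
 */
static void
append_quals(char *buf, size_t bufsize, const char *const *quals)
{
	int			first = 1;

	if (*quals == NULL)
		return;					/* no row filter: leave the command as-is */

	strncat(buf, " WHERE ", bufsize - strlen(buf) - 1);
	for (; *quals; quals++)
	{
		if (!first)
			strncat(buf, " AND ", bufsize - strlen(buf) - 1);
		first = 0;
		strncat(buf, *quals, bufsize - strlen(buf) - 1);
	}
}
```

Starting from `COPY (SELECT a, b FROM public.t` with filters `(a > 10)` and `(b < 5)`, then appending `) TO STDOUT`, produces `COPY (SELECT a, b FROM public.t WHERE (a > 10) AND (b < 5)) TO STDOUT`.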
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 14d737fd93..1220203af7 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +148,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +157,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -620,6 +638,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
OutputPluginWrite(ctx, false);
}
+/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache the ExprState using CacheMemoryContext. This is the same code as
+ * ExecPrepareExpr(), which is not used here because this code path does
+ * not use an EState. It should probably be a separate executor function
+ * for executing expressions outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * If this returns true, the change is replicated; otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the relation is part of multiple publications of the subscription,
+ * each with a different row filter, all of the row filters must match
+ * for this change to be replicated.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
/*
* Sends the decoded DML over wire.
*
@@ -647,7 +808,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +832,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +839,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * The schema must be sent before the logic below that switches the
+ * relation, because it also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +872,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +906,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +975,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1113,9 +1294,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1138,6 +1320,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1149,6 +1333,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1162,6 +1347,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1171,6 +1372,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1230,9 +1434,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1350,6 +1578,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1359,6 +1588,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1376,7 +1607,15 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ {
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
+ }
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
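The two semantic rules the pgoutput changes above establish — a NULL filter result counts as false, and every filter from every publication must pass — can be captured in a tiny stand-alone sketch (`RfResult` and `row_filter_matches` are hypothetical names, not the patch's code):

```c
#include <stdbool.h>

/* Outcome of evaluating one row filter expression against a tuple. */
typedef enum RfResult
{
	RF_FALSE,
	RF_TRUE,
	RF_NULL						/* treated as false, per pgoutput_row_filter_exec_expr */
} RfResult;

/*
 * Mirror pgoutput_row_filter(): replicate the change only if every filter
 * evaluates to true; NULL counts as false and short-circuits the loop.
 */
static bool
row_filter_matches(const RfResult *results, int nfilters)
{
	for (int i = 0; i < nfilters; i++)
	{
		if (results[i] != RF_TRUE)
			return false;
	}
	return true;				/* no filters, or all matched */
}
```

With no filters at all the change is always replicated, matching the early `entry->exprstate == NIL` bail-out.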
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 67be849829..1f19ae4384 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4140,6 +4140,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4150,9 +4151,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4161,6 +4169,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4201,6 +4210,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4233,8 +4246,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 29af845ece..f932a704eb 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -629,6 +629,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 90ff649be7..5f6418a572 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index f332bad4d4..2703b9c3fe 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -83,6 +83,13 @@ typedef struct Publication
PublicationActions pubactions;
} Publication;
+typedef struct PublicationRelationInfo
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationInfo;
+
extern Publication *GetPublication(Oid pubid);
extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
extern List *GetRelationPublications(Oid relid);
@@ -108,7 +115,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelationInfo *pri,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504cbb..154bb61777 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 6a4d82f0a8..56d13ff022 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -490,6 +490,7 @@ typedef enum NodeTag
T_PartitionRangeDatum,
T_PartitionCmd,
T_VacuumRelation,
+ T_PublicationTable,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 7af13dee43..875b809099 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3625,12 +3625,19 @@ typedef struct AlterTSConfigurationStmt
} AlterTSConfigurationStmt;
+typedef struct PublicationTable
+{
+ NodeTag type;
+ RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
+} PublicationTable;
+
typedef struct CreatePublicationStmt
{
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3643,7 +3650,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 1500de2dd0..4537543a7b 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 4a5ef0bc24..319c6bc7d9 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -158,6 +158,77 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
ERROR: cannot add relation "testpub_view" to publication
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index d844075368..b1606cce7e 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,38 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- fail - view
CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
new file mode 100644
index 0000000000..6428f0da00
--- /dev/null
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -0,0 +1,300 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions must evaluate to true. In this case, rows are replicated
+# only if the c value is divisible by both 2 and 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# The publication option publish_via_partition_root is false, so the row
+# filter of each partition is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, because NULL <> 'filtered' evaluates to NULL, which is treated as false
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, the row filter references column b, which is not part
+#   of the PK or REPLICA IDENTITY, so the old tuple has b = NULL and the row
+#   filter evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# The publication option publish_via_partition_root is true, so the row
+# filter of the root partitioned table is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After the TRUNCATE and REFRESH PUBLICATION, the initial data copy applies
+# the row filter of the root partitioned table.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950)    NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
2.20.1
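As the TAP test above notes, when a table is published by several publications with different row filters, a change is replicated only if every filter matches. A minimal sketch of that AND semantics (illustrative Python names, not code from the patch), mirroring the tap_pub_1 (c % 2 = 0) and tap_pub_2 (c % 3 = 0) case:

```python
# Each publication contributes one row filter; a change is replicated
# only when every filter evaluates to true (logical AND).
def row_passes(filters, row):
    return all(f(row) for f in filters)

filters = [lambda c: c % 2 == 0,   # tap_pub_1 filter
           lambda c: c % 3 == 0]   # tap_pub_2 filter

# Same data as the test: INSERT INTO tab_rowfilter_2 (c) 1..20
replicated = [c for c in range(1, 21) if row_passes(filters, c)]
print(replicated)  # [6, 12, 18]
```

This matches the expected count/min/max (3|6|18) asserted in the test.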
v27-0002-publication-parameter-row_filter_on_update.patch (text/x-patch) Download
From f00cfd18a7d10254a0539213e38129ed56823869 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Wed, 18 Aug 2021 19:32:23 -0300
Subject: [PATCH v27 2/2] publication parameter: row_filter_on_update
This parameter controls which tuple is used to evaluate the WHERE clause
expression on UPDATE operations. The allowed values are new and old. The
default is new.
This patch introduces a new per-entry List that contains the row filter
data. An entry can therefore hold multiple row filters with different
row_filter_on_update settings, testing the old tuple for some and the new
tuple for others.
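The tuple-selection rule described above can be sketched as follows (a simplified Python model of the patch's behavior, not the C implementation itself; function and value names are illustrative):

```python
# Which tuple the row filter evaluates, per this patch:
#   INSERT -> new tuple; DELETE -> old tuple;
#   UPDATE -> new or old tuple, per the publication's
#   row_filter_on_update setting ('new' is the default).
def filter_tuple(action, old, new, row_filter_on_update="new"):
    if action == "INSERT":
        return new
    if action == "DELETE":
        return old
    # UPDATE: the old tuple may be absent; when the old tuple was
    # requested but is None, the patch skips replication entirely.
    if row_filter_on_update == "old":
        return old  # may be None, meaning "do not replicate"
    return new
```

For example, `filter_tuple("UPDATE", old={"a": 1}, new={"a": 2})` picks the new tuple under the default setting.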
---
doc/src/sgml/ref/create_publication.sgml | 12 ++
src/backend/catalog/pg_publication.c | 1 +
src/backend/commands/publicationcmds.c | 44 ++++++-
src/backend/replication/pgoutput/pgoutput.c | 132 ++++++++++++++------
src/include/catalog/pg_publication.h | 11 ++
src/test/regress/expected/publication.out | 3 +
src/test/regress/sql/publication.sql | 2 +
src/test/subscription/t/025_row_filter.pl | 65 +++++++++-
8 files changed, 224 insertions(+), 46 deletions(-)
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 8f78fbbd90..ba1eac08ce 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -146,6 +146,18 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>row_filter_on_update</literal> (<type>string</type>)</term>
+ <listitem>
+ <para>
+ This parameter controls which tuple is used to evaluate the
+ row filter expression for <literal>UPDATE</literal> operations.
+ The allowed values are <literal>new</literal> and
+ <literal>old</literal>. The default is to use the new tuple.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist></para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 6057fc3220..77c4f83c7f 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -474,6 +474,7 @@ GetPublication(Oid pubid)
pub->pubactions.pubdelete = pubform->pubdelete;
pub->pubactions.pubtruncate = pubform->pubtruncate;
pub->pubviaroot = pubform->pubviaroot;
+ pub->pubrowfilterupd = pubform->pubrowfilterupd;
ReleaseSysCache(tup);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 0df7ffbe54..00373db654 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -60,12 +60,15 @@ parse_publication_options(ParseState *pstate,
bool *publish_given,
PublicationActions *pubactions,
bool *publish_via_partition_root_given,
- bool *publish_via_partition_root)
+ bool *publish_via_partition_root,
+ bool *row_filter_on_update_given,
+ char *row_filter_on_update)
{
ListCell *lc;
*publish_given = false;
*publish_via_partition_root_given = false;
+ *row_filter_on_update_given = false;
/* defaults */
pubactions->pubinsert = true;
@@ -73,6 +76,7 @@ parse_publication_options(ParseState *pstate,
pubactions->pubdelete = true;
pubactions->pubtruncate = true;
*publish_via_partition_root = false;
+ *row_filter_on_update = PUB_ROW_FILTER_UPD_NEW_TUPLE;
/* Parse options */
foreach(lc, options)
@@ -131,6 +135,24 @@ parse_publication_options(ParseState *pstate,
*publish_via_partition_root_given = true;
*publish_via_partition_root = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "row_filter_on_update") == 0)
+ {
+ char *rowfilterupd;
+
+ if (*row_filter_on_update_given)
+ errorConflictingDefElem(defel, pstate);
+ *row_filter_on_update_given = true;
+ rowfilterupd = defGetString(defel);
+
+ if (strcmp(rowfilterupd, "new") == 0)
+ *row_filter_on_update = PUB_ROW_FILTER_UPD_NEW_TUPLE;
+ else if (strcmp(rowfilterupd, "old") == 0)
+ *row_filter_on_update = PUB_ROW_FILTER_UPD_OLD_TUPLE;
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unrecognized \"row_filter_on_update\" value: \"%s\"", rowfilterupd)));
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -154,6 +176,8 @@ CreatePublication(ParseState *pstate, CreatePublicationStmt *stmt)
PublicationActions pubactions;
bool publish_via_partition_root_given;
bool publish_via_partition_root;
+ bool row_filter_on_update_given;
+ char row_filter_on_update;
AclResult aclresult;
/* must have CREATE privilege on database */
@@ -193,7 +217,9 @@ CreatePublication(ParseState *pstate, CreatePublicationStmt *stmt)
stmt->options,
&publish_given, &pubactions,
&publish_via_partition_root_given,
- &publish_via_partition_root);
+ &publish_via_partition_root,
+ &row_filter_on_update_given,
+ &row_filter_on_update);
puboid = GetNewOidWithIndex(rel, PublicationObjectIndexId,
Anum_pg_publication_oid);
@@ -210,6 +236,8 @@ CreatePublication(ParseState *pstate, CreatePublicationStmt *stmt)
BoolGetDatum(pubactions.pubtruncate);
values[Anum_pg_publication_pubviaroot - 1] =
BoolGetDatum(publish_via_partition_root);
+ values[Anum_pg_publication_pubrowfilterupd - 1] =
+ CharGetDatum(row_filter_on_update);
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
@@ -264,6 +292,8 @@ AlterPublicationOptions(ParseState *pstate, AlterPublicationStmt *stmt,
PublicationActions pubactions;
bool publish_via_partition_root_given;
bool publish_via_partition_root;
+ bool row_filter_on_update_given;
+ char row_filter_on_update;
ObjectAddress obj;
Form_pg_publication pubform;
@@ -271,7 +301,9 @@ AlterPublicationOptions(ParseState *pstate, AlterPublicationStmt *stmt,
stmt->options,
&publish_given, &pubactions,
&publish_via_partition_root_given,
- &publish_via_partition_root);
+ &publish_via_partition_root,
+ &row_filter_on_update_given,
+ &row_filter_on_update);
/* Everything ok, form a new tuple. */
memset(values, 0, sizeof(values));
@@ -299,6 +331,12 @@ AlterPublicationOptions(ParseState *pstate, AlterPublicationStmt *stmt,
replaces[Anum_pg_publication_pubviaroot - 1] = true;
}
+ if (row_filter_on_update_given)
+ {
+ values[Anum_pg_publication_pubrowfilterupd - 1] = CharGetDatum(row_filter_on_update);
+ replaces[Anum_pg_publication_pubrowfilterupd - 1] = true;
+ }
+
tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
replaces);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 1220203af7..5313cd0b78 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -95,6 +95,19 @@ static void send_repl_origin(LogicalDecodingContext *ctx,
RepOriginId origin_id, XLogRecPtr origin_lsn,
bool send_origin);
+/*
+ * One relation can have multiple row filters. This structure holds the data
+ * for each row filter: an ExprState and a TupleTableSlot (cached for reuse),
+ * plus a variable indicating which tuple the row filter uses for UPDATE
+ * actions.
+ */
+typedef struct RowFilterState
+{
+ ExprState *exprstate;
+ TupleTableSlot *scantuple;
+ char row_filter_on_update;
+} RowFilterState;
+
/*
* Entry in the map used to remember which relation schemas we sent.
*
@@ -123,8 +136,7 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
- List *exprstate; /* ExprState for row filter */
- TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ List *rfstate; /* row filter list */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -161,7 +173,7 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static EState *create_estate_for_relation(Relation rel);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(Relation relation, int action, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
/*
@@ -731,15 +743,14 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
* If it returns true, the change is replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter(Relation relation, int action, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
{
EState *estate;
- ExprContext *ecxt;
ListCell *lc;
bool result = true;
/* Bail out if there is no row filter */
- if (entry->exprstate == NIL)
+ if (entry->rfstate == NIL)
return true;
elog(DEBUG3, "table \"%s.%s\" has row filter",
@@ -750,31 +761,62 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
estate = create_estate_for_relation(relation);
- /* Prepare context per tuple */
- ecxt = GetPerTupleExprContext(estate);
- ecxt->ecxt_scantuple = entry->scantuple;
-
- ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
-
/*
* If the subscription has multiple publications and the same table has a
* different row filter in these publications, all row filters must be
* matched in order to replicate this change.
*/
- foreach(lc, entry->exprstate)
+ foreach(lc, entry->rfstate)
{
- ExprState *exprstate = (ExprState *) lfirst(lc);
+ ExprContext *ecxt;
+ RowFilterState *rfstate = (RowFilterState *) lfirst(lc);
+
+ /* Bail out if row_filter_on_update = old and old tuple is NULL */
+ if (action == REORDER_BUFFER_CHANGE_UPDATE && oldtuple == NULL &&
+ rfstate->row_filter_on_update == PUB_ROW_FILTER_UPD_OLD_TUPLE)
+ return false;
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = rfstate->scantuple;
+
+ /*
+ * Choose which tuple to evaluate the row filter against:
+ * - INSERT: uses the new tuple.
+ * - UPDATE: uses the new or old tuple, as controlled by the publication
+ * parameter row_filter_on_update. The default is the new tuple.
+ * - DELETE: uses the old tuple.
+ *
+ */
+ switch (action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ ExecStoreHeapTuple(newtuple, ecxt->ecxt_scantuple, false);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ if (rfstate->row_filter_on_update == PUB_ROW_FILTER_UPD_NEW_TUPLE)
+ ExecStoreHeapTuple(newtuple, ecxt->ecxt_scantuple, false);
+ else if (rfstate->row_filter_on_update == PUB_ROW_FILTER_UPD_OLD_TUPLE)
+ ExecStoreHeapTuple(oldtuple, ecxt->ecxt_scantuple, false);
+ else
+ Assert(false);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ ExecStoreHeapTuple(oldtuple, ecxt->ecxt_scantuple, false);
+ break;
+ }
/* Evaluates row filter */
- result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+ result = pgoutput_row_filter_exec_expr(rfstate->exprstate, ecxt);
/* If the tuple does not match one of the row filters, bail out */
if (!result)
break;
+
+ ResetExprContext(ecxt);
}
/* Cleanup allocated resources */
- ResetExprContext(ecxt);
FreeExecutorState(estate);
PopActiveSnapshot();
@@ -840,7 +882,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(relation, change->action, NULL, tuple, relentry))
break;
/*
@@ -873,7 +915,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter(relation, change->action, oldtuple, newtuple, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -907,7 +949,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(relation, change->action, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1320,8 +1362,7 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
- entry->scantuple = NULL;
- entry->exprstate = NIL;
+ entry->rfstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1347,21 +1388,26 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
publications_valid = true;
}
- /* Release tuple table slot */
- if (entry->scantuple != NULL)
+ foreach(lc, entry->rfstate)
{
- ExecDropSingleTupleTableSlot(entry->scantuple);
- entry->scantuple = NULL;
- }
+ RowFilterState *rfstate = (RowFilterState *) lfirst(lc);
- /*
- * Create a tuple table slot for row filter. TupleDesc must live as
- * long as the cache remains.
- */
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- tupdesc = CreateTupleDescCopy(tupdesc);
- entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
- MemoryContextSwitchTo(oldctx);
+ /* Release tuple table slot */
+ if (rfstate->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(rfstate->scantuple);
+ rfstate->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ rfstate->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+ }
/*
* Build publication cache. We can't use one provided by relcache as
@@ -1447,15 +1493,19 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
if (!rfisnull)
{
- Node *rfnode;
- ExprState *exprstate;
+ Node *rfnode;
+ RowFilterState *rfstate;
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfstate = palloc0(sizeof(RowFilterState));
rfnode = stringToNode(TextDatumGetCString(rfdatum));
/* Prepare for expression execution */
- exprstate = pgoutput_row_filter_init_expr(rfnode);
- entry->exprstate = lappend(entry->exprstate, exprstate);
+ rfstate->exprstate = pgoutput_row_filter_init_expr(rfnode);
+ rfstate->row_filter_on_update = pub->pubrowfilterupd;
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ rfstate->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->rfstate = lappend(entry->rfstate, rfstate);
MemoryContextSwitchTo(oldctx);
}
@@ -1608,10 +1658,10 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
- if (entry->exprstate != NIL)
+ if (entry->rfstate != NIL)
{
- list_free_deep(entry->exprstate);
- entry->exprstate = NIL;
+ list_free_deep(entry->rfstate);
+ entry->rfstate = NIL;
}
}
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 2703b9c3fe..e68591c1fd 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -54,6 +54,12 @@ CATALOG(pg_publication,6104,PublicationRelationId)
/* true if partition changes are published using root schema */
bool pubviaroot;
+
+ /*
+ * Choose which tuple to use for row filter on UPDATE actions.
+ * See constants below.
+ */
+ char pubrowfilterupd;
} FormData_pg_publication;
/* ----------------
@@ -81,8 +87,13 @@ typedef struct Publication
bool alltables;
bool pubviaroot;
PublicationActions pubactions;
+ char pubrowfilterupd;
} Publication;
+/* pubrowfilterupd values */
+#define PUB_ROW_FILTER_UPD_NEW_TUPLE 'n' /* use new tuple */
+#define PUB_ROW_FILTER_UPD_OLD_TUPLE 'o' /* use old tuple */
+
typedef struct PublicationRelationInfo
{
Oid relid;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 319c6bc7d9..1a4ba78033 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -29,6 +29,8 @@ CREATE PUBLICATION testpub_xxx WITH (publish_via_partition_root = 'true', publis
ERROR: conflicting or redundant options
LINE 1: ...ub_xxx WITH (publish_via_partition_root = 'true', publish_vi...
^
+CREATE PUBLICATION testpub_xxx WITH (row_filter_on_update = 'foo');
+ERROR: unrecognized "row_filter_on_update" value: "foo"
\dRp
List of publications
Name | Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
@@ -205,6 +207,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+ALTER PUBLICATION testpub5 SET (row_filter_on_update = 'new');
-- fail - functions disallowed
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
ERROR: functions are not allowed in publication WHERE expressions
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index b1606cce7e..352d4482b1 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -24,6 +24,7 @@ ALTER PUBLICATION testpub_default SET (publish = update);
CREATE PUBLICATION testpub_xxx WITH (foo);
CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
CREATE PUBLICATION testpub_xxx WITH (publish_via_partition_root = 'true', publish_via_partition_root = '0');
+CREATE PUBLICATION testpub_xxx WITH (row_filter_on_update = 'foo');
\dRp
@@ -108,6 +109,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
\dRp+ testpub5
+ALTER PUBLICATION testpub5 SET (row_filter_on_update = 'new');
-- fail - functions disallowed
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-- fail - user-defined operators disallowed
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index 6428f0da00..3ad5de024c 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 7;
+use Test::More tests => 10;
# create publisher node
my $node_publisher = PostgresNode->new('publisher');
@@ -22,6 +22,8 @@ $node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (a int, b integer, primary key(a, b))");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
);
@@ -44,6 +46,8 @@ $node_subscriber->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_subscriber->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (a int, b integer, primary key(a, b))");
$node_subscriber->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
);
@@ -82,6 +86,9 @@ $node_publisher->safe_psql('postgres',
$node_publisher->safe_psql('postgres',
"ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4 FOR TABLE tab_rowfilter_4 WHERE (a < 10 AND b < 40)"
+);
$node_publisher->safe_psql('postgres',
"CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
);
@@ -103,6 +110,8 @@ $node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (a, b) SELECT i, i + 30 FROM generate_series(1, 6) i");
# insert data into partitioned table and directly on the partition
$node_publisher->safe_psql('postgres',
@@ -115,7 +124,7 @@ $node_publisher->safe_psql('postgres',
my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
my $appname = 'tap_sub';
$node_subscriber->safe_psql('postgres',
- "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4"
);
$node_publisher->wait_for_catchup($appname);
@@ -159,6 +168,13 @@ $result =
"SELECT count(a) FROM tab_rowfilter_3");
is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+# Check expected replicated rows for tab_rowfilter_4
+# tap_pub_4 filter is: (a < 10 AND b < 40)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a), min(a), max(a), min(b), max(b) FROM tab_rowfilter_4");
+is($result, qq(6|1|6|31|36), 'check initial data copy from table tab_rowfilter_4');
+
# Check expected replicated rows for partitions
# publication option publish_via_partition_root is false so use the row filter
# from a partition
@@ -210,6 +226,11 @@ $node_publisher->safe_psql('postgres',
"UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
$node_publisher->safe_psql('postgres',
"DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+# publication parameter: row_filter_on_update = new
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_4 SET a = 7, b = 37 WHERE a = 1 AND b = 31");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_4 SET a = 102, b = 132 WHERE a = 2 AND b = 32");
$node_publisher->wait_for_catchup($appname);
@@ -237,6 +258,46 @@ is($result, qq(1001|test 1001
1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+# Check expected replicated rows for tab_rowfilter_4
+# tap_pub_4 filter is: (a < 10 AND b < 40)
+#
+# - UPDATE (7, 37) YES, uses new tuple for row filter
+# - UPDATE (102, 132) NO, uses new tuple for row filter
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_4 ORDER BY 1, 2");
+is($result, qq(2|32
+3|33
+4|34
+5|35
+6|36
+7|37), 'check replicated rows to table tab_rowfilter_4');
+
+# publication parameter: row_filter_on_update = old
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_4 SET (row_filter_on_update = old)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_4 SET a = 8, b = 38 WHERE a = 3 AND b = 33");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_4 SET a = 104, b = 134 WHERE a = 4 AND b = 34");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_4
+# tap_pub_4 filter is: (a < 10 AND b < 40)
+#
+# - UPDATE (8, 38) YES, uses old tuple for row filter
+# - UPDATE (104, 134) YES, uses old tuple for row filter
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_4 ORDER BY 1, 2");
+is($result, qq(2|32
+5|35
+6|36
+7|37
+8|38
+104|134), 'check replicated rows to table tab_rowfilter_4');
+
# Publish using root partitioned table
# Use a different partitioned table layout (exercise publish_via_partition_root)
$node_publisher->safe_psql('postgres',
--
2.20.1
On Wed, Sep 1, 2021 at 4:53 PM Euler Taveira <euler@eulerto.com> wrote:
On Sun, Aug 29, 2021, at 11:14 PM, Peter Smith wrote:
Here are the new v26* patches. This is a refactoring of the row-filter
caches to remove all the logic from the get_rel_sync_entry function
and delay it until if/when needed in the pgoutput_row_filter function.
This is now implemented per Amit's suggestion to move all the cache
code [1]. It is a replacement for the v25* patches. The make check and TAP subscription tests are all OK. I have repeated
the performance tests [2] and those results are good too.
v26-0001 <--- v23 (base RF patch)
v26-0002 <--- ExprState cache mods (refactored row filter caching)
v26-0002 <--- ExprState cache extra debug logging (temp)
Peter, I'm still reviewing this new cache mechanism. I will provide feedback
as soon as I integrate it as part of this recent modification.
I'm attaching a new version that simply includes Houzj's review [1]. This is
based on v23.
There has been a discussion about which row should be used by the row filter. We
don't have a unanimous choice, so I think it is prudent to provide a way for
the user to change it. I suggested in a previous email [2] that a publication
option should be added. Hence, the row filter can be applied to the old tuple, the new
tuple, or both. This approach is simpler than using OLD/NEW references (less
code, and it avoids validation such as a NEW reference for DELETEs and an OLD reference
for INSERTs). I thought about a reasonable default value and it seems the _new_ tuple
is a good one because (i) it is always available and (ii) the user doesn't have
to figure out that replication is broken due to a column that is not part
of replica identity.
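For illustration only, the proposed option can be sketched like this in Python (all names here are hypothetical and chosen for explanation; the actual patch implements the equivalent logic in C inside pgoutput):

```python
# Hypothetical sketch: which tuple the row filter is evaluated against for
# each replicated action, under the proposed row_filter_on_update option.
def tuple_for_row_filter(action, old_tuple, new_tuple, row_filter_on_update="new"):
    if action == "INSERT":
        return new_tuple   # only the new tuple exists
    if action == "DELETE":
        return old_tuple   # only the old tuple (replica identity columns) exists
    if action == "UPDATE":
        # the proposed option chooses which tuple UPDATE filtering uses
        if row_filter_on_update == "old" and old_tuple is not None:
            return old_tuple
        return new_tuple   # default 'new': always available
    return None

def publish(action, old_tuple, new_tuple, row_filter, **opts):
    """Return True if the change passes the row filter and is sent."""
    t = tuple_for_row_filter(action, old_tuple, new_tuple, **opts)
    return t is not None and bool(row_filter(t))
```

With a filter like `a < 10`, an UPDATE from `a = 1` to `a = 100` is dropped under the default (new-tuple) setting but published under `row_filter_on_update = old`, which is exactly the behavior the new TAP tests for tab_rowfilter_4 exercise.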
I think this or any other similar solution for row filters (on
updates) won't work till we solve the problem reported by Hou-San [1].
The main reason is that we don't have data for unchanged toast columns
in WAL. For that, we have discussed some probable solutions in email
[2], however, that also required us to solve one of the existing
bugs[3].
[1]: /messages/by-id/OS0PR01MB571618736E7E79309A723BBE94E99@OS0PR01MB5716.jpnprd01.prod.outlook.com
[2]: /messages/by-id/CAA4eK1JLQqNZypOpN7h3=Vt0JJW4Yb_FsLJS=T8J9J-WXgFMYg@mail.gmail.com
[3]: /messages/by-id/OS0PR01MB611342D0A92D4F4BF26C0F47FB229@OS0PR01MB6113.jpnprd01.prod.outlook.com
--
With Regards,
Amit Kapila.
On Wed, Sep 1, 2021, at 9:36 AM, Amit Kapila wrote:
I think this or any other similar solution for row filters (on
updates) won't work till we solve the problem reported by Hou-San [1].
The main reason is that we don't have data for unchanged toast columns
in WAL. For that, we have discussed some probable solutions in email
[2], however, that also required us to solve one of the existing
bugs[3].
I didn't mention but I'm working on it in parallel.
I agree with you that including TOAST values in the WAL is a possible solution
for this issue. This is a popular request for wal2json [1][2][3] and I think
other output plugins have the same request too. It is useful for CDC solutions.
I'm experimenting with 2 approaches: (i) always include unchanged TOAST values in the
new tuple if a GUC is set and (ii) include unchanged TOAST values in the new tuple
iff they weren't included in the old tuple. The advantage of the first option is
that you fix the problem adjusting a parameter in your configuration file.
However, the disadvantage is that, depending on your setup -- REPLICA IDENTITY
FULL, you might have the same TOAST value for a single change twice in the WAL.
The second option solves the disadvantage of (i) but it only works if you have
REPLICA IDENTITY FULL and Dilip's patch applied [4] (I expect to review it
soon). In the output plugin, (i) requires a simple modification (remove
restriction for unchanged TOAST values) but (ii) needs a more complex surgery.
[1]: https://github.com/eulerto/wal2json/issues/205
[2]: https://github.com/eulerto/wal2json/issues/132
[3]: https://github.com/eulerto/wal2json/issues/42
[4]: /messages/by-id/CAFiTN-uW50w0tWoUBg_VYCdvNeCzT=t=JzhmiFd452FrLOwMMQ@mail.gmail.com
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Wed, Sep 1, 2021 at 8:29 PM Euler Taveira <euler@eulerto.com> wrote:
On Wed, Sep 1, 2021, at 9:36 AM, Amit Kapila wrote:
I think this or any other similar solution for row filters (on
updates) won't work till we solve the problem reported by Hou-San [1].
The main reason is that we don't have data for unchanged toast columns
in WAL. For that, we have discussed some probable solutions in email
[2], however, that also required us to solve one of the existing
bugs[3].
I didn't mention but I'm working on it in parallel.
I agree with you that including TOAST values in the WAL is a possible solution
for this issue. This is a popular request for wal2json [1][2][3] and I think
other output plugins have the same request too. It is useful for CDC solutions.
I'm experimenting with 2 approaches: (i) always include unchanged TOAST values in the
new tuple if a GUC is set and (ii) include unchanged TOAST values in the new tuple
iff they weren't included in the old tuple.
In the second approach, we will always end up having unchanged toast
columns for non-key columns in the WAL which will be a significant
overhead, so not sure if that can be acceptable if we want to do it by
default.
The advantage of the first option is
that you fix the problem adjusting a parameter in your configuration file.
However, the disadvantage is that, depending on your setup -- REPLICA IDENTITY
FULL, you might have the same TOAST value for a single change twice in the WAL.
The second option solves the disadvantage of (i) but it only works if you have
REPLICA IDENTITY FULL and Dilip's patch applied [4] (I expect to review it
soon).
Thanks for offering the review of that patch. I think it will be good
to get it committed.
In the output plugin, (i) requires a simple modification (remove
restriction for unchanged TOAST values) but (ii) needs a more complex surgery.
I think if we get Dilip's patch then we can have a rule for filter
columns such that they can contain only replica identity key columns.
This rule is anyway required for Deletes and we can have it for
Updates. At this stage, I haven't checked what it takes to implement
such a solution but it would be worth investigating it.
--
With Regards,
Amit Kapila.
On Thu, Sep 2, 2021 at 1:43 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
...
I think if we get Dilip's patch then we can have a rule for filter
columns such that they can contain only replica identity key columns.
This rule is anyway required for Deletes and we can have it for
Updates. At this stage, I haven't checked what it takes to implement
such a solution but it would be worth investigating it.
Yes, I have been experimenting with part of this puzzle. I have
already implemented some POC code to extract the list of table columns
contained within the row filter expression. I can share it after I
clean it up some more if that is helpful.
------
Kind Regards,
Peter Smith.
Fujitsu Australia.
On Wed, Sep 1, 2021 at 9:23 PM Euler Taveira <euler@eulerto.com> wrote:
On Sun, Aug 29, 2021, at 11:14 PM, Peter Smith wrote:
...
Peter, I'm still reviewing this new cache mechanism. I will provide feedback
as soon as I integrate it as part of this recent modification.
Hi Euler, for your next version can you please also integrate the
tab-autocomplete change back into the main patch.
This autocomplete change was originally posted quite a few weeks ago
here [1] but seems to have been overlooked.
I've rebased it and it applied OK to your latest v27* set. PSA.
Thanks!
------
[1]: /messages/by-id/CAHut+PuLoZuHD_A=n8GshC84Nc=8guReDsTmV1RFsCYojssD8Q@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v1-0001-Add-tab-auto-complete-support-for-the-Row-Filter-.patchapplication/octet-stream; name=v1-0001-Add-tab-auto-complete-support-for-the-Row-Filter-.patchDownload
From 0b8730deb928b49aa93bb5d7dab42245e1a90de4 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 3 Sep 2021 11:51:06 +1000
Subject: [PATCH v1] Add tab auto-complete support for the Row Filter WHERE.
Following auto-completes are added:
Complete "CREATE PUBLICATION <name> FOR TABLE <name>" with "WHERE (".
Complete "ALTER PUBLICATION <name> ADD TABLE <name>" with "WHERE (".
Complete "ALTER PUBLICATION <name> SET TABLE <name>" with "WHERE (".
---
src/bin/psql/tab-complete.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 75b8676..c33e114 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1648,6 +1648,11 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET", "TABLE", MatchAny)
+ || Matches("ALTER", "PUBLICATION", MatchAny, "ADD", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (");
/* ALTER PUBLICATION <name> SET ( */
else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("publish", "publish_via_partition_root");
@@ -2693,9 +2698,10 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("TABLE", "ALL TABLES");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL"))
COMPLETE_WITH("TABLES");
- else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES")
- || Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("WITH (");
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
--
1.8.3.1
On Wed, Sep 1, 2021 at 9:23 PM Euler Taveira <euler@eulerto.com> wrote:
On Sun, Aug 29, 2021, at 11:14 PM, Peter Smith wrote:
Here are the new v26* patches. This is a refactoring of the row-filter
caches to remove all the logic from the get_rel_sync_entry function
and delay it until if/when needed in the pgoutput_row_filter function.
This is now implemented per Amit's suggestion to move all the cache
code [1]. It is a replacement for the v25* patches. The make check and TAP subscription tests are all OK. I have repeated
the performance tests [2] and those results are good too.
v26-0001 <--- v23 (base RF patch)
v26-0002 <--- ExprState cache mods (refactored row filter caching)
v26-0002 <--- ExprState cache extra debug logging (temp)
Peter, I'm still reviewing this new cache mechanism. I will provide feedback
as soon as I integrate it as part of this recent modification.
I'm attaching a new version that simply includes Houzj's review [1]. This is
based on v23.
There has been a discussion about which row should be used by the row filter. We
don't have a unanimous choice, so I think it is prudent to provide a way for
the user to change it. I suggested in a previous email [2] that a publication
option should be added. Hence, the row filter can be applied to the old tuple, the new
tuple, or both. This approach is simpler than using OLD/NEW references (less
code, and it avoids validation such as a NEW reference for DELETEs and an OLD reference
for INSERTs). I thought about a reasonable default value and it seems the _new_ tuple
is a good one because (i) it is always available and (ii) the user doesn't have
to figure out that replication is broken due to a column that is not part
of replica identity. I'm attaching a POC that implements it. I'm still
polishing it. Adding tests for multiple row filters and integrating Peter's caching
mechanism [3] are the next steps.
Assuming this _new_tuple option is enabled:
1. An UPDATE, where the new_tuple satisfies the row filter, but the
old_tuple did not (not checked). Since the row filter check passed
but the actual row never existed on the subscriber, would this patch
convert the UPDATE to an INSERT or would this UPDATE be ignored? Based
on the tests that I did, I see that it is ignored.
2. An UPDATE where the new tuple does not satisfy the row filter but
the old_tuple did. Since the new_tuple did not match the row filter,
wouldn't this row now remain divergent on the replica?
Somehow this approach of either new_tuple or old_tuple doesn't seem to
be very fruitful if the user requires that his replica is up-to-date
based on the filter condition. For that, I think you will need to
convert UPDATEs to either INSERTs or DELETEs when only one of the
new_tuple and old_tuple matches the filter condition, but not both.
UPDATE
old-row (match) new-row (no match) -> DELETE
old-row (no match) new row (match) -> INSERT
old-row (match) new row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
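The four cases above amount to a small decision function. This Python sketch is illustrative only (not code from any posted patch) and assumes both tuples' filter results are already known:

```python
def transform_update(old_matches, new_matches):
    """Map the row-filter results for an UPDATE's old and new tuples to the
    action that keeps the subscriber consistent with the filtered set."""
    if old_matches and new_matches:
        return "UPDATE"    # row stays inside the filtered set
    if old_matches and not new_matches:
        return "DELETE"    # row leaves the filtered set
    if not old_matches and new_matches:
        return "INSERT"    # row enters the filtered set
    return None            # row was never visible: drop the change
```

Note the caveat discussed earlier in the thread still applies: evaluating the filter on the old tuple is only reliable when the filter references replica identity columns, since the old tuple carries only those.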
regards,
Ajin Cherian
Fujitsu Australia
Hi Euler,
As you probably know the "base" Row-Filter v27-0001 got seriously
messed up by a recent commit that had lots of overlaps with your code [1].
e.g. It broke trying to apply on HEAD as follows:
[postgres@CentOS7-x64 oss_postgres_RowFilter]$ git apply
v27-0001-Row-filter-for-logical-replication.patch
error: patch failed: src/backend/catalog/pg_publication.c:141
error: src/backend/catalog/pg_publication.c: patch does not apply
error: patch failed: src/backend/commands/publicationcmds.c:384
error: src/backend/commands/publicationcmds.c: patch does not apply
error: patch failed: src/backend/parser/gram.y:426
error: src/backend/parser/gram.y: patch does not apply
error: patch failed: src/include/catalog/pg_publication.h:83
error: src/include/catalog/pg_publication.h: patch does not apply
error: patch failed: src/include/nodes/nodes.h:490
error: src/include/nodes/nodes.h: patch does not apply
error: patch failed: src/include/nodes/parsenodes.h:3625
error: src/include/nodes/parsenodes.h: patch does not apply
error: patch failed: src/test/regress/expected/publication.out:158
error: src/test/regress/expected/publication.out: patch does not apply
error: patch failed: src/test/regress/sql/publication.sql:93
error: src/test/regress/sql/publication.sql: patch does not apply
~~
I know you are having discussions in the other (Col-Filtering) thread
about the names PublicationRelationInfo versus PublicationRelInfo etc,
but meanwhile, I am in need of a working "base" Row-Filter patch so
that I can post my incremental work, and so that the cfbot can
continue to run ok.
Since your v27 has been broken for several days already I've taken it
upon myself to re-base it. PSA.
v27-0001 --> v28-0001.
(AFAIK this new v28 applies ok and passes all regression and TAP
subscription tests)
Note: This v28 patch was made only so that I can (soon) post some
other small incremental patches on top of it, and also so the cfbot
will be able to run them ok. If you do not like it then just overwrite
it - I am happy to work with whatever latest "base" patch you provide
so long as it is compatible with the current master code.
------
[1]: https://github.com/postgres/postgres/commit/0c6828fa987b791744b9c8685aadf1baa21f8977#
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v28-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v28-0001-Row-filter-for-logical-replication.patchDownload
From 083fb5c10e77749a4f76e23ff65fb7ef0e9248d3 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 8 Sep 2021 22:02:44 +1000
Subject: [PATCH v28] Row filter for logical replication
This feature adds row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If subscriber is a pre-15 version, data
synchronization won't use row filters if they are defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 53 ++++-
src/backend/commands/publicationcmds.c | 104 ++++++----
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 5 +-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 257 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 4 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 5 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 71 +++++++
src/test/regress/sql/publication.sql | 33 +++
src/test/subscription/t/025_row_filter.pl | 300 ++++++++++++++++++++++++++++
25 files changed, 1000 insertions(+), 79 deletions(-)
create mode 100644 src/test/subscription/t/025_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2f0def9..1fca628 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6231,6 +6231,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b..4bb4314 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbc..8f78fbb 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. The <literal>WHERE</literal> clause has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -183,6 +187,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will
+ not be replicated. That is because the old row is used, and it contains
+ only the primary key or the columns covered by the <literal>REPLICA
+ IDENTITY</literal>; the remaining columns are <literal>NULL</literal>. For
+ <command>INSERT</command> and <command>UPDATE</command> operations, any
+ column may be used in the <literal>WHERE</literal> clause, because the new
+ row is used and it contains all columns. A <literal>NULL</literal> value
+ causes the expression to evaluate to false; avoid using columns without
+ not-null constraints in the <literal>WHERE</literal> clause. The
+ <literal>WHERE</literal> clause does not allow functions or user-defined operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -197,6 +216,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -210,6 +234,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 702934e..94e3981 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15, any row
+ filtering is ignored during the initial data copy.
</para>
</listitem>
</varlistentry>
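As a side note on the <literal>copy_data</literal> behavior documented above (filters from several publications are AND'ed, and a NULL result counts as false), here is a rough Python simulation of the intended semantics. This is not patch code; the table and filter expressions are made up for illustration:

```python
# Rough simulation (not patch code) of the documented initial-copy
# semantics: a row is copied only if it satisfies the WHERE expression
# of every publication that publishes the table, and a NULL (None)
# filter result counts as false.

def row_matches(row, predicates):
    """True only if every publication filter accepts the row."""
    for pred in predicates:
        result = pred(row)
        if result is None or not result:    # NULL => false
            return False
    return True

# Two hypothetical publications with different WHERE clauses.
filters = [
    lambda r: None if r["id"] is None else r["id"] > 2000,  # WHERE (id > 2000)
    lambda r: r["active"],                                  # WHERE (active IS TRUE)
]

rows = [
    {"id": 2500, "active": True},   # passes both filters: copied
    {"id": 1500, "active": True},   # fails first filter: skipped
    {"id": None, "active": True},   # NULL id: treated as false, skipped
]

copied = [r for r in rows if row_matches(r, filters)]
```

Only the first row survives, mirroring the "rows must satisfy all expressions" rule stated in the documentation.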
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index d6fddd6..a1ea0f8 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,21 +144,27 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -172,10 +181,30 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -189,6 +218,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -205,11 +240,19 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel->relation);
+ CacheInvalidateRelcache(targetrel);
return myself;
}
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 30929da..f10539c 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -389,38 +389,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could instead remove only
+ * (i) tables that are not found in the new table list and (ii) tables
+ * that are being re-added with a different qual expression. But for
+ * (ii), simply updating the existing tuple is not enough, because of
+ * qual expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- PublicationRelInfo *newpubrel;
-
- newpubrel = (PublicationRelInfo *) lfirst(newlc);
- if (RelationGetRelid(newpubrel->relation) == oldrelid)
- {
- found = true;
- break;
- }
- }
- /* Not yet in the list, open it and add to the list */
- if (!found)
- {
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
- }
+ PublicationRelInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -539,9 +525,10 @@ RemovePublicationById(Oid pubid)
}
/*
- * Open relations specified by a PublicationTable list.
- * In the returned list of PublicationRelInfo, tables are locked
- * in ShareUpdateExclusiveLock mode in order to add them to a publication.
+ * Open relations specified by a list of PublicationTable or RangeVar nodes.
+ *
+ * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
+ * add them to a publication.
*/
static List *
OpenTableList(List *tables)
@@ -549,22 +536,46 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelInfo *pub_rel;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- PublicationTable *t = lfirst_node(PublicationTable, lc);
- bool recurse = t->relation->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
- PublicationRelInfo *pub_rel;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a PublicationTable list
+ * (relation and optional WHERE clause), while DROP TABLE provides a
+ * plain RangeVar list. Check which node type this element is.
+ */
+ if (IsA(lfirst(lc), PublicationTable))
+ whereclause = true;
+ else
+ whereclause = false;
+
+ if (whereclause)
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+ else
+ {
+ rv = lfirst_node(RangeVar, lc);
+ }
+
+ recurse = rv->inh;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
- rel = table_openrv(t->relation, ShareUpdateExclusiveLock);
+ rel = table_openrv(rv, ShareUpdateExclusiveLock);
myrelid = RelationGetRelid(rel);
/*
@@ -581,7 +592,12 @@ OpenTableList(List *tables)
}
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = myrelid;
pub_rel->relation = rel;
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
@@ -616,7 +632,13 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = childrelid;
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -643,6 +665,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -663,7 +687,7 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(pub_rel->relid, GetUserId()))
aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
RelationGetRelationName(rel));
@@ -692,11 +716,9 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
PublicationRelInfo *pubrel = (PublicationRelInfo *) lfirst(lc);
- Relation rel = pubrel->relation;
- Oid relid = RelationGetRelid(rel);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pubrel->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -706,7 +728,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pubrel->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index e308de1..f3a73cb 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4945,6 +4945,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 99440b4..f85d4a2 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -3118,6 +3118,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 6a0f465..c0ac3a5 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9637,10 +9637,11 @@ publication_table_list:
{ $$ = lappend($1, $3); }
;
-publication_table: relation_expr
+publication_table: relation_expr OptWhereClause
{
PublicationTable *n = makeNode(PublicationTable);
n->relation = $1;
+ n->whereClause = $2;
$$ = (Node *) n;
}
;
@@ -9681,7 +9682,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE publication_table_list
+ | ALTER PUBLICATION name DROP TABLE relation_expr_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f928c32..3210506 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -205,8 +205,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -509,6 +520,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1769,6 +1781,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3089,6 +3104,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..e946f17 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
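The operator restriction added to make_op() above relies on the fact that OIDs at or above FirstNormalObjectId (16384) are assigned to user-created objects. A tiny sketch of the same check, outside the backend:

```python
# Sketch of the user-defined-operator check added to make_op() above.
# In PostgreSQL, OIDs at or above FirstNormalObjectId (16384) belong to
# user-created objects, so an operator with such an OID is rejected in
# a publication WHERE expression.

FIRST_NORMAL_OBJECT_ID = 16384

def operator_allowed_in_publication_where(oper_oid: int) -> bool:
    """Mirror of: opform->oid >= FirstNormalObjectId => error."""
    return oper_oid < FIRST_NORMAL_OBJECT_ID
```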
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..9d86a10 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get relation qual. DISTINCT avoids including the same expression
+ * multiple times when a table is published with the same qual in
+ * more than one publication.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
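The COPY command construction in copy_table() above is a bit hard to follow inside the diff; here is a rough Python model of the query building. It is simplified: it ignores the non-table/generated-column case and assumes identifiers and quals arrive already quoted:

```python
# Simplified model (not patch code) of how copy_table() builds the COPY
# command: a plain COPY when there is no row filter, otherwise
# COPY (SELECT cols FROM tbl WHERE q1 AND q2 ...) TO STDOUT.

def build_copy_command(nspname, relname, columns, quals):
    """Assumes nspname/relname/columns/quals are pre-quoted strings."""
    target = f"{nspname}.{relname}"
    if not quals:
        # Regular table with no row filter
        return f"COPY {target} TO STDOUT"
    # Build a SELECT with the AND'ed row filters for COPY
    cmd = "COPY (SELECT " + ", ".join(columns)
    cmd += f" FROM {target}"
    cmd += " WHERE " + " AND ".join(quals)
    cmd += ") TO STDOUT"
    return cmd
```

For example, with quals `["(id > 2000)", "(active IS TRUE)"]` this yields `COPY (SELECT id, active FROM public.departments WHERE (id > 2000) AND (active IS TRUE)) TO STDOUT`, matching the AND'ed-filter behavior the patch implements for initial synchronization.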
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 14d737f..1220203 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +148,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +157,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +639,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache ExprState using CacheMemoryContext. This is the same code as
+ * ExecPrepareExpr() but that is not used because it doesn't use an EState.
+ * It should probably be another function in the executor to handle the
+ * execution outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false, i.e., the change
+ * isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the relation's row filter, if any.
+ *
+ * Returns true if the change should be replicated, false otherwise.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +808,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +832,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +839,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * Send the schema before the relation is possibly switched to
+ * its ancestor, because the ancestor's schema is sent as well.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +872,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +906,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +975,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1113,9 +1294,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1138,6 +1320,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1149,6 +1333,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1162,6 +1347,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1171,6 +1372,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1230,9 +1434,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1350,6 +1578,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1359,6 +1588,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1376,7 +1607,15 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ {
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
+ }
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 67be849..1f19ae4 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4140,6 +4140,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4150,9 +4151,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4161,6 +4169,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4201,6 +4210,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4233,8 +4246,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 29af845..f932a70 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -629,6 +629,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 90ff649..5f6418a 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 561266a..f748434 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,7 +85,9 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
+ Oid relid;
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -113,7 +115,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 743e5aa..599d5cd 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3628,6 +3628,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
typedef struct CreatePublicationStmt
@@ -3635,7 +3636,7 @@ typedef struct CreatePublicationStmt
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3648,7 +3649,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 1500de2..4537543 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index cad1b37..156f14c 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -158,6 +158,77 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 04b34ee..331b821 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,39 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
new file mode 100644
index 0000000..6428f0d
--- /dev/null
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -0,0 +1,300 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions must be satisfied. In this case, rows are replicated only if
+# c is divisible by both 2 and 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, because (NULL <> 'filtered') evaluates to NULL, not true
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
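[Editor's note: for readers following along, the filter semantics exercised by the TAP test above — a row is published only when every publication's filter accepts it — can be sketched as follows. This is an illustrative Python model only, not the patch's C implementation (pgoutput evaluates cached ExprStates; plain Python predicates stand in for the WHERE expressions here):

```python
# Illustrative model of the multi-publication row-filter semantics from the
# TAP test. A publication with no WHERE clause contributes no filter.

def row_passes(row, filters):
    """A row is replicated only if every publication's filter accepts it."""
    return all(f(row) for f in filters)

# tab_rowfilter_2 is in tap_pub_1 WHERE (c % 2 = 0) and tap_pub_2 WHERE (c % 3 = 0)
filters = [lambda r: r["c"] % 2 == 0,
           lambda r: r["c"] % 3 == 0]

# The test inserts c = 1..20; only values divisible by both 2 and 3 survive,
# matching the expected (count, min, max) = (3, 6, 18) result.
replicated = [c for c in range(1, 21) if row_passes({"c": c}, filters)]
```
]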
PSA my new incremental patch (v28-0002) that introduces row filter
validation for the publish mode "delete". The validation requires that
any column referenced in the filter expression also be part of the
REPLICA IDENTITY or primary key.
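[Editor's note: as a rough model of that rule (an illustration only — the patch implements this in C, collecting referenced columns via rowfilter_find_cols and comparing them against the relation's replica identity):

```python
# Sketch of the v28-0002 validation rule, not the patch's C implementation:
# when a publication publishes deletes, every column referenced by the row
# filter must be part of the REPLICA IDENTITY or primary key.

def validate_row_filter(filter_cols, replica_identity_cols, publishes_delete):
    """Return the set of offending columns; empty set means the filter is valid."""
    if not publishes_delete:
        # No old-tuple lookup is needed, so any column may be referenced.
        return set()
    return set(filter_cols) - set(replica_identity_cols)

# Table with PRIMARY KEY (a); filter WHERE (a > 1000 AND b <> 'filtered').
# Column b is not part of the replica identity, so the filter is rejected.
bad = validate_row_filter({"a", "b"}, {"a"}, publishes_delete=True)
```
]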
[This v28-0001 is identical to the most recently posted rebased base
patch. It is included again here only so the cfbot will be happy]
~~
A requirement for some filter validation like this has been mentioned
several times in this thread [1][2][3][4][5].
I also added some test code for various kinds of replica identity.
A couple of existing tests had to be modified so they could continue
to work (e.g. changed to publish = "insert" or REPLICA IDENTITY FULL).
Feedback is welcome.
~~
NOTE: This validation is currently performed only when a filter is
first created. There are probably many other scenarios that need to be
handled properly. What should happen if something that impacts an
existing filter is changed?
e.g.
- what if the user changes the publish parameter using ALTER
PUBLICATION set (publish="delete") etc?
- what if the user changes the replication identity?
- what if the user changes the filter using ALTER PUBLICATION in a way
that is no longer compatible with the necessary cols?
- what if the user changes the table (e.g. removes a column referred
to by a filter)?
- what if the user changes a referred column name?
- more...
(None of those are addressed yet - thoughts?)
------
[1]: /messages/by-id/92e5587d-28b8-5849-2374-5ca3863256f1@2ndquadrant.com
[2]: /messages/by-id/CAA4eK1JL2q+HENgiCf1HLRU7nD9jCcttB9sEqV1tech4mMv_0A@mail.gmail.com
[3]: /messages/by-id/202107132106.wvjgvjgcyezo@alvherre.pgsql
[4]: /messages/by-id/202107141452.edncq4ot5zkg@alvherre.pgsql
[5]: /messages/by-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v28-0002-Row-filter-validation-replica-identity.patch (application/octet-stream)
From e28824fae114277a394ab7193fa0ab6337329eb9 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 9 Sep 2021 13:12:17 +1000
Subject: [PATCH v28] Row filter validation - replica identity
This patch introduces some additional row filter validation. Currently it is
implemented only for the publish mode "delete": it validates that any column
referenced in the filter expression is part of the REPLICA IDENTITY or primary
key.
Also updated test code.
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
---
src/backend/catalog/dependency.c | 58 ++++++++++++++++++
src/backend/catalog/pg_publication.c | 74 +++++++++++++++++++++++
src/include/catalog/dependency.h | 6 ++
src/test/regress/expected/publication.out | 97 +++++++++++++++++++++++++++++--
src/test/regress/sql/publication.sql | 78 ++++++++++++++++++++++++-
src/test/subscription/t/025_row_filter.pl | 7 +--
6 files changed, 310 insertions(+), 10 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index 91c3e97..e81f093 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -1554,6 +1554,64 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Find all the columns referenced by the row-filter expression and return what
+ * is found as a list of RfCol. This list is used for row-filter validation.
+ */
+List *
+rowfilter_find_cols(Node *expr, Oid relId)
+{
+ find_expr_references_context context;
+ RangeTblEntry rte;
+ int ref;
+ List *rfcol_list = NIL;
+
+ context.addrs = new_object_addresses();
+
+ /* We gin up a rather bogus rangetable list to handle Vars */
+ MemSet(&rte, 0, sizeof(rte));
+ rte.type = T_RangeTblEntry;
+ rte.rtekind = RTE_RELATION;
+ rte.relid = relId;
+ rte.relkind = RELKIND_RELATION; /* no need for exactness here */
+ rte.rellockmode = AccessShareLock;
+
+ context.rtables = list_make1(list_make1(&rte));
+
+ /* Scan the expression tree for referenceable objects */
+ find_expr_references_walker(expr, &context);
+
+ /* Remove any duplicates */
+ eliminate_duplicate_dependencies(context.addrs);
+
+ /* Build/Return the list of columns referenced by this Row Filter */
+ for (ref = 0; ref < context.addrs->numrefs; ref++)
+ {
+ ObjectAddress *thisobj = context.addrs->refs + ref;
+
+ if (thisobj->classId == RelationRelationId)
+ {
+ RfCol *rfcol;
+
+ /*
+ * The parser already took care of ensuring columns must be from
+ * the correct table.
+ */
+ Assert(thisobj->objectId == relId);
+
+ rfcol = palloc(sizeof(RfCol));
+ rfcol->name = get_attname(thisobj->objectId, thisobj->objectSubId, false);
+ rfcol->attnum = thisobj->objectSubId;
+
+ rfcol_list = lappend(rfcol_list, rfcol);
+ }
+ }
+
+ free_object_addresses(context.addrs);
+
+ return rfcol_list;
+}
+
+/*
* recordDependencyOnExpr - find expression dependencies
*
* This is used to find the dependencies of rules, constraint expressions,
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index a1ea0f8..ff2f28d 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -139,6 +139,77 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
+/*
+ * Walk the parse-tree to decide if the row-filter is valid or not.
+ */
+static void
+rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+{
+ Oid relid = RelationGetRelid(rel);
+ char *relname = RelationGetRelationName(rel);
+
+ /*
+ * Rule:
+ *
+ * If the publish operation contains "delete" then only columns that
+ * are allowed by the REPLICA IDENTITY rules are permitted to be used in
+ * the row-filter WHERE clause.
+ *
+ * TODO - check later for publish "update" case.
+ */
+ if (pub->pubactions.pubdelete)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ List *rfcols;
+ ListCell *lc;
+ Bitmapset *bms_okcols;
+
+ /*
+ * Find what are the cols that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+
+ /*
+ * Find what cols are referenced in the row filter WHERE clause,
+ * and validate that each of those referenced cols is allowed.
+ */
+ rfcols = rowfilter_find_cols(rfnode, relid);
+ foreach(lc, rfcols)
+ {
+ RfCol *rfcol = lfirst(lc);
+ char *colname = rfcol->name;
+ int attnum = rfcol->attnum;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, bms_okcols))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ relname),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+ }
+
+ bms_free(bms_okcols);
+ list_free_deep(rfcols);
+ }
+ }
+}
/*
* Insert new publication / relation mapping.
@@ -204,6 +275,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 2885f35..2c7310e 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -151,6 +151,12 @@ extern void performDeletion(const ObjectAddress *object,
extern void performMultipleDeletions(const ObjectAddresses *objects,
DropBehavior behavior, int flags);
+typedef struct RfCol {
+ char *name;
+ int attnum;
+} RfCol;
+extern List *rowfilter_find_cols(Node *expr, Oid relId);
+
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
DependencyType behavior);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 156f14c..aa97e4d 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -163,13 +163,15 @@ CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -179,7 +181,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -190,7 +192,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -201,7 +203,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
@@ -229,6 +231,91 @@ DROP TABLE testpub_rf_tbl4;
DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 331b821..8552b36 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -98,7 +98,9 @@ CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -125,6 +127,80 @@ DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index 6428f0d..dc9becc 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -223,9 +225,7 @@ $node_publisher->wait_for_catchup($appname);
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -234,7 +234,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
--
1.8.3.1
v28-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v28-0001-Row-filter-for-logical-replication.patchDownload
From cef8b8f81f01c1e2cd5c8d332ef192252ce55398 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 8 Sep 2021 22:02:44 +1000
Subject: [PATCH v28] Row filter for logical replication
This feature adds a row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or a set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name; the
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY; otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (which is limited to the
primary key or REPLICA IDENTITY columns) to evaluate the row filter. INSERT
and UPDATE use the new row version, hence any column can be used. If the row
filter evaluates to NULL, it is treated as false. For simplicity, functions
are not allowed; this could be addressed in a future patch.
If you choose to do the initial table synchronization, only data that
satisfies the row filters is sent. If the subscription has several
publications in which a table has been published with different WHERE
clauses, rows must satisfy all expressions to be copied. If the subscriber
is a pre-15 version, data synchronization won't use row filters even if
they are defined on the publisher; previous versions cannot handle them.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 53 ++++-
src/backend/commands/publicationcmds.c | 104 ++++++----
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 5 +-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 257 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 4 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 5 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 71 +++++++
src/test/regress/sql/publication.sql | 33 +++
src/test/subscription/t/025_row_filter.pl | 300 ++++++++++++++++++++++++++++
25 files changed, 1000 insertions(+), 79 deletions(-)
create mode 100644 src/test/subscription/t/025_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2f0def9..1fca628 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6231,6 +6231,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b..4bb4314 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbc..8f78fbb 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -183,6 +187,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or that are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will not
+ be replicated. That's because the old row is used and it only contains the
+ primary key or columns that are part of the <literal>REPLICA IDENTITY</literal>;
+ the remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ and <command>UPDATE</command> operations, any column might be used in the
+ <literal>WHERE</literal> clause. The new row is used and it contains all
+ columns. A <literal>NULL</literal> value causes the expression to evaluate
+ to false; avoid using columns without not-null constraints in the
+ <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
+ not allow functions or user-defined operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -197,6 +216,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -210,6 +234,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 702934e..94e3981 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index d6fddd6..a1ea0f8 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,21 +144,27 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -172,10 +181,30 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -189,6 +218,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -205,11 +240,19 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel->relation);
+ CacheInvalidateRelcache(targetrel);
return myself;
}
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 30929da..f10539c 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -389,38 +389,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings, instead of dropping only (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because the qual
+ * expression's dependencies must be dropped and recorded again.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- PublicationRelInfo *newpubrel;
-
- newpubrel = (PublicationRelInfo *) lfirst(newlc);
- if (RelationGetRelid(newpubrel->relation) == oldrelid)
- {
- found = true;
- break;
- }
- }
- /* Not yet in the list, open it and add to the list */
- if (!found)
- {
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
- }
+ PublicationRelInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -539,9 +525,10 @@ RemovePublicationById(Oid pubid)
}
/*
- * Open relations specified by a PublicationTable list.
- * In the returned list of PublicationRelInfo, tables are locked
- * in ShareUpdateExclusiveLock mode in order to add them to a publication.
+ * Open relations specified by a RangeVar list (PublicationTable or Relation).
+ *
+ * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
+ * add them to a publication.
*/
static List *
OpenTableList(List *tables)
@@ -549,22 +536,46 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelInfo *pub_rel;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- PublicationTable *t = lfirst_node(PublicationTable, lc);
- bool recurse = t->relation->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
- PublicationRelInfo *pub_rel;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a list of PublicationTable
+ * nodes (relation and WHERE clause). ALTER PUBLICATION ... DROP TABLE
+ * provides a list of RangeVar nodes. Check which node type this list
+ * element is before using it.
+ */
+ if (IsA(lfirst(lc), PublicationTable))
+ whereclause = true;
+ else
+ whereclause = false;
+
+ if (whereclause)
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+ else
+ {
+ rv = lfirst_node(RangeVar, lc);
+ }
+
+ recurse = rv->inh;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
- rel = table_openrv(t->relation, ShareUpdateExclusiveLock);
+ rel = table_openrv(rv, ShareUpdateExclusiveLock);
myrelid = RelationGetRelid(rel);
/*
@@ -581,7 +592,12 @@ OpenTableList(List *tables)
}
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = myrelid;
pub_rel->relation = rel;
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
@@ -616,7 +632,13 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = childrelid;
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -643,6 +665,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -663,7 +687,7 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(pub_rel->relid, GetUserId()))
aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
RelationGetRelationName(rel));
@@ -692,11 +716,9 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
PublicationRelInfo *pubrel = (PublicationRelInfo *) lfirst(lc);
- Relation rel = pubrel->relation;
- Oid relid = RelationGetRelid(rel);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pubrel->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -706,7 +728,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pubrel->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index e308de1..f3a73cb 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4945,6 +4945,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 99440b4..f85d4a2 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -3118,6 +3118,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 6a0f465..c0ac3a5 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9637,10 +9637,11 @@ publication_table_list:
{ $$ = lappend($1, $3); }
;
-publication_table: relation_expr
+publication_table: relation_expr OptWhereClause
{
PublicationTable *n = makeNode(PublicationTable);
n->relation = $1;
+ n->whereClause = $2;
$$ = (Node *) n;
}
;
@@ -9681,7 +9682,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE publication_table_list
+ | ALTER PUBLICATION name DROP TABLE relation_expr_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f928c32..3210506 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -205,8 +205,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -509,6 +520,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1769,6 +1781,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3089,6 +3104,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..e946f17 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Disallow user-defined operators in publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..9d86a10 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get the relation qual. DISTINCT prevents the same expression, when a
+ * table is in multiple publications, from being included multiple times
+ * in the final expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
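For reference, the initial-sync command construction in copy_table() above (a plain COPY when there is no row filter, otherwise a COPY (SELECT ...) with the fetched filters AND'ed into a WHERE clause) can be sketched in Python; the quoting and the names are simplified and illustrative:

```python
def build_copy_query(nspname, relname, columns, quals):
    """Mimic copy_table(): plain COPY when there is no row filter,
    otherwise COPY (SELECT ... WHERE q1 AND q2 ...) TO STDOUT."""
    qualified = f'"{nspname}"."{relname}"'
    if not quals:
        return f"COPY {qualified} TO STDOUT"
    cols = ", ".join(columns)
    # Row filters coming from different publications are AND'ed together.
    where = " AND ".join(quals)
    return f"COPY (SELECT {cols} FROM {qualified} WHERE {where}) TO STDOUT"
```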
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 14d737f..1220203 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +148,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +157,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +639,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache ExprState using CacheMemoryContext. This is the same code as
+ * ExecPrepareExpr() but that is not used because it doesn't use an EState.
+ * It should probably be another function in the executor to handle the
+ * execution outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change should be replicated; false otherwise.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +808,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +832,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +839,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * The schema must be sent before the relation is possibly switched
+ * to its ancestor (publish via root), because maybe_send_schema()
+ * also sends the ancestor's schema.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +872,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +906,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +975,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1113,9 +1294,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1138,6 +1320,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1149,6 +1333,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1162,6 +1347,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for the row filter. The TupleDesc must
+ * live as long as the cache entry remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1171,6 +1372,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1230,9 +1434,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1350,6 +1578,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1359,6 +1588,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1376,7 +1607,15 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ {
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
+ }
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
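The evaluation semantics implemented in pgoutput_row_filter() above (filters from all publications on the table are AND'ed, a NULL result counts as false, and evaluation bails out on the first non-match) reduce to a small rule that can be sketched in Python with illustrative names:

```python
def row_filter_matches(filters, tuple_):
    """Mimic pgoutput_row_filter(): a change is replicated only if every
    filter matches; a NULL (None) result counts as false."""
    for f in filters:
        result = f(tuple_)          # each filter returns True/False/None
        if result is None or not result:
            return False            # bail out on the first non-match
    return True                     # no filters, or all of them matched
```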
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 2febcd4..a60b369 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4140,6 +4140,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4150,9 +4151,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4161,6 +4169,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4201,6 +4210,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4233,8 +4246,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
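The pg_dump changes above come down to a simple rule: fetch prqual only from servers new enough to have it (15+, with older servers selecting NULL in its place), and emit the WHERE clause only when it is non-NULL. A minimal sketch, with illustrative names:

```python
def dump_publication_table(pubname, tablename, remote_version, prrelqual):
    """Mimic dumpPublicationTable(): append a WHERE clause only when the
    server has pg_publication_rel.prqual (version 15+) and it is set."""
    query = f"ALTER PUBLICATION {pubname} ADD TABLE ONLY {tablename}"
    # Servers before 15 have no prqual column; the query selects NULL then.
    if remote_version >= 150000 and prrelqual is not None:
        query += f" WHERE ({prrelqual})"
    return query + ";\n"
```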
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 29af845..f932a70 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -629,6 +629,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 90ff649..5f6418a 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 561266a..f748434 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,7 +85,9 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
+ Oid relid;
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -113,7 +115,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 743e5aa..599d5cd 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3628,6 +3628,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
typedef struct CreatePublicationStmt
@@ -3635,7 +3636,7 @@ typedef struct CreatePublicationStmt
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3648,7 +3649,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 1500de2..4537543 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index cad1b37..156f14c 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -158,6 +158,77 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 04b34ee..331b821 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,39 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
new file mode 100644
index 0000000..6428f0d
--- /dev/null
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -0,0 +1,300 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions must succeed. In this case, rows are replicated if the c value
+# is divisible by both 2 AND 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
On Thu, Sep 9, 2021 at 11:43 AM Peter Smith <smithpb2250@gmail.com> wrote:
PSA my new incremental patch (v28-0002) that introduces row filter
validation for the publish mode "delete". The validation requires that
any columns referred to in the filter expression must also be part of
the REPLICA IDENTITY or PK.

[This v28-0001 is identical to the most recently posted rebased base
patch. It is included again here only so the cfbot will be happy.]

~~

A requirement for some filter validation like this has been mentioned
several times in this thread [1][2][3][4][5].

I also added some test code for various kinds of replica identity.
A couple of existing tests had to be modified so they could continue
to work (e.g. changed publish = "insert" or REPLICA IDENTITY FULL).

Feedback is welcome.
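The validation rule described above can be sketched as follows. This is only an illustration in Python, not the patch's C implementation; the function name and argument shapes are invented for the example (the real check lives in rowfilter_expr_checker in v28-0002 and raises an error via ereport()):

```python
def validate_row_filter(filter_columns, replica_identity, identity_columns,
                        publishes_delete):
    """Return the list of filter columns that violate the rule (empty = OK).

    filter_columns    -- columns referenced by the row filter WHERE clause
    replica_identity  -- 'full', 'default' (PK), or 'index'
    identity_columns  -- columns of the PK / replica identity index
    publishes_delete  -- True if the publication's publish list has "delete"
    """
    # FULL identity covers every column, and without "delete" the old
    # tuple is never needed, so in both cases any column is allowed.
    if not publishes_delete or replica_identity == 'full':
        return []

    allowed = set(identity_columns)
    return [col for col in filter_columns if col not in allowed]


# A filter touching non-key column 'b' is rejected when deletes are published:
bad = validate_row_filter(['a', 'b'], 'default', ['a'], publishes_delete=True)
```

Here `bad` would name column 'b', corresponding to the "Row filter column ... is not part of the REPLICA IDENTITY" error in the patch.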
~~
NOTE: This validation currently only checks when the filters are first
created. There are probably many other scenarios that need to be
properly handled. What should happen if something that impacts an
existing filter is changed? e.g.
- what if the user changes the publish parameter using ALTER
PUBLICATION set (publish="delete") etc?
- what if the user changes the replication identity?
- what if the user changes the filter using ALTER PUBLICATION in a way
that is no longer compatible with the necessary cols?
- what if the user changes the table (e.g. removes a column referred
to by a filter)?
- what if the user changes a referred column name?
- more...

(None of those are addressed yet - thoughts?)
I think we need to remove the filter or the table from publication in
such cases. Now, one can think of just removing the condition related
to the column being removed/changed in some way but I think that won't
be appropriate because it would change the meaning of the filter. We
are discussing similar stuff in the column filter thread and we might
want to do the same for row filters as well. I would prefer to remove
the table in both cases as Rahila has proposed in the column filter
patch.
--
With Regards,
Amit Kapila.
I have attached a POC row-filter validation patch implemented using a
parse-tree 'walker' function.
PSA the incremental patch v28-0003.
v28-0001 --> v28-0001 (same as before - base patch)
v28-0002 --> v28-0002 (same as before - replica identity validation patch)
v28-0003 (NEW POC PATCH using "walker" validation)
~~
This kind of 'walker' validation has already been proposed/recommended
several times up-thread [1][2][3].
For this POC patch, I have removed all the existing
EXPR_KIND_PUBLICATION_WHERE parser errors. I am not 100% sure this is
the best idea (see below), but for now, the parser errors are
temporarily disabled with #if 0 in the code. I will clean up this patch
and re-post
later when there is some feedback/consensus on how to proceed.
~
1. PROS
1.1 Using a 'walker' validator allows the row filter expression
validation to be 'opt-in' instead of 'opt-out' checking logic. This
may be considered *safer* because now we can have a very
controlled/restricted set of allowed nodes - e.g. only allow simple
(Var op Const) expressions. This eliminates the risk that some
unforeseen dangerous loophole could be exploited.
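The 'opt-in' allow-list idea can be illustrated with a toy walker. The node classes below are invented stand-ins for PostgreSQL parse nodes (the real POC patch walks the actual parse tree); the point is only that anything not explicitly on the allow-list is rejected:

```python
from dataclasses import dataclass

# Toy stand-ins for parse-tree nodes (invented for this sketch).
@dataclass
class Var:
    name: str

@dataclass
class Const:
    value: object

@dataclass
class OpExpr:
    op: str
    left: object
    right: object

@dataclass
class BoolExpr:
    op: str            # 'AND' or 'OR'
    args: list

ALLOWED_OPS = {'=', '<>', '<', '<=', '>', '>='}

def filter_is_valid(node):
    """Opt-in walker: only AND/OR of simple (Var op Const) comparisons pass.

    Any node kind not explicitly handled here (function call, subquery,
    user-defined operator, ...) falls through and is rejected.
    """
    if isinstance(node, BoolExpr):
        return (node.op in ('AND', 'OR')
                and all(filter_is_valid(a) for a in node.args))
    if isinstance(node, OpExpr):
        return (node.op in ALLOWED_OPS
                and isinstance(node.left, Var)
                and isinstance(node.right, Const))
    return False

# (a > 1000 AND b <> 'filtered') passes the allow-list.
expr = BoolExpr('AND', [OpExpr('>', Var('a'), Const(1000)),
                        OpExpr('<>', Var('b'), Const('filtered'))])
```

An unknown operator (e.g. the user-defined =#> from the regression test) would be rejected simply because it is not on the allow-list, with no need to enumerate forbidden node types.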
1.2 It is convenient to have all the row-filter validation errors in
one place, instead of being scattered across the parser code based on
EXPR_KIND_PUBLICATION_WHERE. Indeed, there seems some confusion
already caused by the existing scattering of row-filter validation
(patch 0001). For example, I found some of the new "aggregate
functions are not allowed" errors are not even reachable because they
are shielded by the earlier "functions are not allowed" error.
2. CONS
2.1 Error messages thrown from the parser can include the character
location of the problem. Actually, this is also possible using the
'walker' (I have done it locally) but it requires passing the
ParseState into the walker code - something that seemed a bit unusual
to me, so I did not include it in this 0003 POC patch.
~~
Perhaps a hybrid validation is preferred. e.g. retain some/all of the
parser validation errors from the 0001 patch, but also keep the walker
validation as a 'catch-all' to trap anything unforeseen that may slip
through the parsing. Or perhaps this 'walker' validator is fine as the
only validator and all the current parser errors for
EXPR_KIND_PUBLICATION_WHERE can just be permanently removed.
I am not sure what is the best approach, so I am hoping for some
feedback and/or review comments.
------
[1]: /messages/by-id/33c033f7-be44-e241-5fdf-da1b328c288d@enterprisedb.com
[2]: /messages/by-id/CAA4eK1Jumuio6jZK8AVQd6z7gpDsZydQhK6d=MUARxk3nS7+Pw@mail.gmail.com
[3]: /messages/by-id/CAA4eK1JL2q+HENgiCf1HLRU7nD9jCcttB9sEqV1tech4mMv_0A@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v28-0002-Row-filter-validation-replica-identity.patchapplication/octet-stream; name=v28-0002-Row-filter-validation-replica-identity.patchDownload
From e28824fae114277a394ab7193fa0ab6337329eb9 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 9 Sep 2021 13:12:17 +1000
Subject: [PATCH v28] Row filter validation - replica identity
This patch introduces some additional row filter validation. Currently it is
implemented only for the publish mode "delete" and it validates that any columns
referenced in the filter expression must be part of REPLICA IDENTITY or Primary
Key.
Also updated test code.
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
---
src/backend/catalog/dependency.c | 58 ++++++++++++++++++
src/backend/catalog/pg_publication.c | 74 +++++++++++++++++++++++
src/include/catalog/dependency.h | 6 ++
src/test/regress/expected/publication.out | 97 +++++++++++++++++++++++++++++--
src/test/regress/sql/publication.sql | 78 ++++++++++++++++++++++++-
src/test/subscription/t/025_row_filter.pl | 7 +--
6 files changed, 310 insertions(+), 10 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index 91c3e97..e81f093 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -1554,6 +1554,64 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Find all the columns referenced by the row-filter expression and return what
+ * is found as a list of RfCol. This list is used for row-filter validation.
+ */
+List *
+rowfilter_find_cols(Node *expr, Oid relId)
+{
+ find_expr_references_context context;
+ RangeTblEntry rte;
+ int ref;
+ List *rfcol_list = NIL;
+
+ context.addrs = new_object_addresses();
+
+ /* We gin up a rather bogus rangetable list to handle Vars */
+ MemSet(&rte, 0, sizeof(rte));
+ rte.type = T_RangeTblEntry;
+ rte.rtekind = RTE_RELATION;
+ rte.relid = relId;
+ rte.relkind = RELKIND_RELATION; /* no need for exactness here */
+ rte.rellockmode = AccessShareLock;
+
+ context.rtables = list_make1(list_make1(&rte));
+
+ /* Scan the expression tree for referenceable objects */
+ find_expr_references_walker(expr, &context);
+
+ /* Remove any duplicates */
+ eliminate_duplicate_dependencies(context.addrs);
+
+ /* Build/Return the list of columns referenced by this Row Filter */
+ for (ref = 0; ref < context.addrs->numrefs; ref++)
+ {
+ ObjectAddress *thisobj = context.addrs->refs + ref;
+
+ if (thisobj->classId == RelationRelationId)
+ {
+ RfCol *rfcol;
+
+ /*
+ * The parser already took care of ensuring columns must be from
+ * the correct table.
+ */
+ Assert(thisobj->objectId == relId);
+
+ rfcol = palloc(sizeof(RfCol));
+ rfcol->name = get_attname(thisobj->objectId, thisobj->objectSubId, false);
+ rfcol->attnum = thisobj->objectSubId;
+
+ rfcol_list = lappend(rfcol_list, rfcol);
+ }
+ }
+
+ free_object_addresses(context.addrs);
+
+ return rfcol_list;
+}
+
+/*
* recordDependencyOnExpr - find expression dependencies
*
* This is used to find the dependencies of rules, constraint expressions,
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index a1ea0f8..ff2f28d 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -139,6 +139,77 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
+/*
+ * Walk the parse-tree to decide if the row-filter is valid or not.
+ */
+static void
+rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+{
+ Oid relid = RelationGetRelid(rel);
+ char *relname = RelationGetRelationName(rel);
+
+ /*
+ * Rule:
+ *
+ * If the publish operation includes "delete", only columns covered
+ * by the REPLICA IDENTITY rules may be used in the row-filter WHERE
+ * clause.
+ *
+ * TODO - check later for publish "update" case.
+ */
+ if (pub->pubactions.pubdelete)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ List *rfcols;
+ ListCell *lc;
+ Bitmapset *bms_okcols;
+
+ /*
+ * Find the columns that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means the primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+
+ /*
+ * Find which columns are referenced in the row filter WHERE clause,
+ * and validate that each referenced column is allowed.
+ */
+ rfcols = rowfilter_find_cols(rfnode, relid);
+ foreach(lc, rfcols)
+ {
+ RfCol *rfcol = lfirst(lc);
+ char *colname = rfcol->name;
+ int attnum = rfcol->attnum;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, bms_okcols))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ relname),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+ }
+
+ bms_free(bms_okcols);
+ list_free_deep(rfcols);
+ }
+ }
+}
/*
* Insert new publication / relation mapping.
@@ -204,6 +275,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 2885f35..2c7310e 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -151,6 +151,12 @@ extern void performDeletion(const ObjectAddress *object,
extern void performMultipleDeletions(const ObjectAddresses *objects,
DropBehavior behavior, int flags);
+typedef struct RfCol {
+ char *name;
+ int attnum;
+} RfCol;
+extern List *rowfilter_find_cols(Node *expr, Oid relId);
+
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
DependencyType behavior);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 156f14c..aa97e4d 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -163,13 +163,15 @@ CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -179,7 +181,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -190,7 +192,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -201,7 +203,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
@@ -229,6 +231,91 @@ DROP TABLE testpub_rf_tbl4;
DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 331b821..8552b36 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -98,7 +98,9 @@ CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -125,6 +127,80 @@ DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index 6428f0d..dc9becc 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -223,9 +225,7 @@ $node_publisher->wait_for_catchup($appname);
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -234,7 +234,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
--
1.8.3.1
v28-0001-Row-filter-for-logical-replication.patch (application/octet-stream)
From cef8b8f81f01c1e2cd5c8d332ef192252ce55398 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 8 Sep 2021 22:02:44 +1000
Subject: [PATCH v28] Row filter for logical replication
This feature adds a row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed in parentheses.

The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY; otherwise, DELETEs
won't be replicated. DELETE uses the old row version (which is limited to
the primary key or REPLICA IDENTITY columns) to evaluate the row filter.
INSERT and UPDATE use the new row version, so any column can be referenced.
If the row filter evaluates to NULL, it is treated as false. For simplicity,
functions are not allowed; this could be addressed in a future patch.

If you choose to do the initial table synchronization, only data that
satisfies the row filters is sent. If the subscription has several
publications in which a table has been published with different WHERE
clauses, rows must satisfy all of the expressions to be copied. If the
subscriber is a pre-15 version, data synchronization won't use row filters
even if they are defined in the publisher, because previous versions cannot
handle them.

If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines whether the partition's row filter (if
the parameter is false, the default) or the root partitioned table's row
filter is used.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 53 ++++-
src/backend/commands/publicationcmds.c | 104 ++++++----
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 5 +-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 257 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 4 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 5 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 71 +++++++
src/test/regress/sql/publication.sql | 33 +++
src/test/subscription/t/025_row_filter.pl | 300 ++++++++++++++++++++++++++++
25 files changed, 1000 insertions(+), 79 deletions(-)
create mode 100644 src/test/subscription/t/025_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2f0def9..1fca628 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6231,6 +6231,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b..4bb4314 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbc..8f78fbb 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -183,6 +187,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or that are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will
+ not be replicated. That is because the old row is used, and it only
+ contains the primary key or the columns that are part of the
+ <literal>REPLICA IDENTITY</literal>; the remaining columns are
+ <literal>NULL</literal>. For <command>INSERT</command> and
+ <command>UPDATE</command> operations, any column can be used in the
+ <literal>WHERE</literal> clause, because the new row is used and it
+ contains all columns. A <literal>NULL</literal> value causes the
+ expression to evaluate to false; avoid using columns without not-null
+ constraints in the <literal>WHERE</literal> clause. The
+ <literal>WHERE</literal> clause does not allow functions or user-defined
+ operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -197,6 +216,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -210,6 +234,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 702934e..94e3981 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -102,7 +102,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether the existing data in the publications that are
being subscribed to should be copied once the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index d6fddd6..a1ea0f8 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,21 +144,27 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -172,10 +181,30 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -189,6 +218,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -205,11 +240,19 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel->relation);
+ CacheInvalidateRelcache(targetrel);
return myself;
}
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 30929da..f10539c 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -389,38 +389,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- PublicationRelInfo *newpubrel;
-
- newpubrel = (PublicationRelInfo *) lfirst(newlc);
- if (RelationGetRelid(newpubrel->relation) == oldrelid)
- {
- found = true;
- break;
- }
- }
- /* Not yet in the list, open it and add to the list */
- if (!found)
- {
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
- }
+ PublicationRelInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -539,9 +525,10 @@ RemovePublicationById(Oid pubid)
}
/*
- * Open relations specified by a PublicationTable list.
- * In the returned list of PublicationRelInfo, tables are locked
- * in ShareUpdateExclusiveLock mode in order to add them to a publication.
+ * Open relations specified by a list of PublicationTable or RangeVar nodes.
+ *
+ * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
+ * add them to a publication.
*/
static List *
OpenTableList(List *tables)
@@ -549,22 +536,46 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelInfo *pub_rel;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- PublicationTable *t = lfirst_node(PublicationTable, lc);
- bool recurse = t->relation->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
- PublicationRelInfo *pub_rel;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a list of PublicationTable
+ * nodes (relation, optional WHERE clause), whereas ALTER PUBLICATION
+ * ... DROP TABLE provides a list of bare RangeVar nodes. Check which
+ * kind of List element we are dealing with.
+ */
+ if (IsA(lfirst(lc), PublicationTable))
+ whereclause = true;
+ else
+ whereclause = false;
+
+ if (whereclause)
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+ else
+ {
+ rv = lfirst_node(RangeVar, lc);
+ }
+
+ recurse = rv->inh;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
- rel = table_openrv(t->relation, ShareUpdateExclusiveLock);
+ rel = table_openrv(rv, ShareUpdateExclusiveLock);
myrelid = RelationGetRelid(rel);
/*
@@ -581,7 +592,12 @@ OpenTableList(List *tables)
}
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = myrelid;
pub_rel->relation = rel;
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
@@ -616,7 +632,13 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = childrelid;
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -643,6 +665,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -663,7 +687,7 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(pub_rel->relid, GetUserId()))
aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
RelationGetRelationName(rel));
@@ -692,11 +716,9 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
PublicationRelInfo *pubrel = (PublicationRelInfo *) lfirst(lc);
- Relation rel = pubrel->relation;
- Oid relid = RelationGetRelid(rel);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pubrel->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -706,7 +728,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pubrel->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
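The reworked OpenTableList above has to cope with two list shapes: ADD TABLE hands it PublicationTable nodes carrying an optional WHERE clause, while DROP TABLE hands it bare RangeVars, and the code dispatches on the node tag. A stand-alone sketch of that tag-based dispatch, using stand-in structs instead of the real PostgreSQL node machinery (all names here are illustrative):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the two node tags OpenTableList distinguishes. */
typedef enum { T_RangeVar, T_PublicationTable } NodeTag;

typedef struct { NodeTag type; const char *relname; } RangeVar;
typedef struct { NodeTag type; RangeVar *relation; const char *whereClause; } PublicationTable;

/* Mimics the IsA(lfirst(lc), PublicationTable) check: look at the tag. */
static const char *
extract_where(void *list_elem)
{
    NodeTag tag = *(NodeTag *) list_elem;

    if (tag == T_PublicationTable)
        return ((PublicationTable *) list_elem)->whereClause;   /* may be NULL */
    return NULL;                        /* bare RangeVar: no WHERE clause */
}
```

As in the patch, a bare RangeVar (the DROP TABLE case) always yields a NULL where-clause.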
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index e308de1..f3a73cb 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4945,6 +4945,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 99440b4..f85d4a2 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -3118,6 +3118,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 6a0f465..c0ac3a5 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9637,10 +9637,11 @@ publication_table_list:
{ $$ = lappend($1, $3); }
;
-publication_table: relation_expr
+publication_table: relation_expr OptWhereClause
{
PublicationTable *n = makeNode(PublicationTable);
n->relation = $1;
+ n->whereClause = $2;
$$ = (Node *) n;
}
;
@@ -9681,7 +9682,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE publication_table_list
+ | ALTER PUBLICATION name DROP TABLE relation_expr_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f928c32..3210506 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -205,8 +205,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -509,6 +520,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1769,6 +1781,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3089,6 +3104,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..e946f17 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..9d86a10 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ message provides during replication. This function also returns the
+ relation qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get the relation quals. DISTINCT prevents the same expression, when
+ * a table is in multiple publications, from being included more than
+ * once in the final expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 14d737f..1220203 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +148,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +157,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +639,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache the ExprState using CacheMemoryContext. This is essentially the
+ * same code as ExecPrepareExpr(), which is not used here because it
+ * requires an EState. There should probably be a separate executor
+ * function for preparing expression execution outside a normal Plan
+ * tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false, i.e. the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change is to be replicated, false otherwise.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications in which the same
+ * table has different row filters, the change must match all of them
+ * in order to be replicated.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +808,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +832,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +839,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * Schema should be sent before the logic that replaces the
+ * relation because it also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +872,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +906,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +975,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1113,9 +1294,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1138,6 +1320,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1149,6 +1333,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1162,6 +1347,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for the row filter. The TupleDesc must
+ * live as long as the cache entry remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1171,6 +1372,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1230,9 +1434,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1350,6 +1578,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1359,6 +1588,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1376,7 +1607,15 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ {
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
+ }
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
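A detail worth keeping in mind from pgoutput_row_filter_exec_expr above: a row filter that evaluates to NULL is treated as false, matching ordinary WHERE-clause semantics, so the change is not replicated. The three-valued collapse can be sketched as (hypothetical helper name):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Collapse SQL three-valued logic to the replicate/skip decision:
 * NULL and false both mean "do not replicate this change".
 */
static bool
row_filter_matches(bool value, bool isnull)
{
    if (isnull)
        return false;
    return value;
}
```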
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 2febcd4..a60b369 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4140,6 +4140,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4150,9 +4151,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4161,6 +4169,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4201,6 +4210,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4233,8 +4246,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 29af845..f932a70 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -629,6 +629,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 90ff649..5f6418a 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 561266a..f748434 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,7 +85,9 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
+ Oid relid;
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -113,7 +115,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 743e5aa..599d5cd 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3628,6 +3628,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
typedef struct CreatePublicationStmt
@@ -3635,7 +3636,7 @@ typedef struct CreatePublicationStmt
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3648,7 +3649,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 1500de2..4537543 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index cad1b37..156f14c 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -158,6 +158,77 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 04b34ee..331b821 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,39 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
new file mode 100644
index 0000000..6428f0d
--- /dev/null
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -0,0 +1,300 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions must evaluate to true. In this case, rows are replicated only
+# if c is divisible by both 2 AND 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# The publication option publish_via_partition_root is false, so the row
+# filter of the leaf partition is used.
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
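[Editor's illustration, not part of the patch] The TAP test above exercises two subtle points: when a table belongs to multiple publications, this patch version replicates a row only if *every* publication's filter evaluates to true (AND semantics), and SQL's three-valued logic means a comparison involving NULL never evaluates to true (which is why the UPDATE to b = NULL and the DELETE of that row were not replicated). A minimal Python sketch of those semantics; all names here are illustrative:

```python
# Simulate row-filter evaluation with SQL-style three-valued logic.
# A filter "passes" only if it evaluates strictly to true; NULL (None)
# propagates and counts as "not true", i.e. the row is filtered out.

def ne_sql(a, b):
    """SQL <> operator: returns None (NULL) if either operand is NULL."""
    if a is None or b is None:
        return None
    return a != b

def row_passes(filters, row):
    """AND semantics across publications: every filter must be strictly true."""
    return all(f(row) is True for f in filters)

# tab_rowfilter_2 is in tap_pub_1 (c % 2 = 0) and tap_pub_2 (c % 3 = 0):
filters = [lambda r: r["c"] % 2 == 0, lambda r: r["c"] % 3 == 0]
replicated = [c for c in range(1, 21) if row_passes(filters, {"c": c})]
print(replicated)  # only rows divisible by both 2 and 3

# tab_rowfilter_1 filter: (a > 1000 AND b <> 'filtered').
# Old tuple (1700, NULL): b <> 'filtered' yields NULL, so the DELETE is skipped.
f1 = [lambda r: (r["a"] > 1000) and ne_sql(r["b"], "filtered") is True]
print(row_passes(f1, {"a": 1700, "b": None}))
print(row_passes(f1, {"a": 1601, "b": "test 1601 updated"}))
```

This matches the expected results asserted in the test: tab_rowfilter_2 ends up with exactly 6, 12, 18, and row 1700 is neither updated nor deleted on the subscriber.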
v28-0003-POC-row-filter-walker-validation.patchapplication/octet-stream; name=v28-0003-POC-row-filter-walker-validation.patchDownload
From c7e121cbc345a23fe4f8f41aea956b84d800750a Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Tue, 14 Sep 2021 12:27:03 +1000
Subject: [PATCH v28] POC row-filter walker validation
This patch implements a parse-tree "walker" to validate a row-filter expression.
Only very simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no functions.
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr
This POC patch also disables (#if 0) all other row-filter errors which were thrown for
EXPR_KIND_PUBLICATION_WHERE in the 0001 patch.
Some regression tests are updated due to the modified validation error messages.
---
src/backend/catalog/dependency.c | 68 +++++++++++++++++++++++++++++++
src/backend/catalog/pg_publication.c | 14 +++++--
src/backend/parser/parse_agg.c | 5 ++-
src/backend/parser/parse_expr.c | 6 ++-
src/backend/parser/parse_func.c | 3 ++
src/backend/parser/parse_oper.c | 2 +
src/include/catalog/dependency.h | 2 +-
src/test/regress/expected/publication.out | 17 +++++---
src/test/regress/sql/publication.sql | 2 +
9 files changed, 107 insertions(+), 12 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index e81f093..405b3cd 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -132,6 +132,12 @@ typedef struct
int subflags; /* flags to pass down when recursing to obj */
} ObjectAddressAndFlags;
+/* for rowfilter_walker */
+typedef struct
+{
+ char *relname;
+} rf_context;
+
/* for find_expr_references_walker */
typedef struct
{
@@ -1554,6 +1560,68 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Walker checks that the row filter expression is legal. Allow only simple
+ * or compound expressions like:
+ *
+ * "(Var Op Const)" or
+ * "(Var Op Const) Bool (Var Op Const)"
+ *
+ * Nothing more complicated is permitted. Specifically, no functions of any kind
+ * and no user-defined operators.
+ */
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var) || IsA(node, Const) || IsA(node, BoolExpr))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ forbidden = _("function calls are not allowed");
+ }
+ else
+ {
+ elog(DEBUG1, "row filter contained something unexpected: %s", nodeToString(node));
+ forbidden = _("too complex");
+ }
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ context->relname),
+ errdetail("%s", forbidden),
+ errhint("only simple expressions using columns and constants are allowed")
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
+/*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if it encounters anything not permitted or unexpected.
+ */
+void
+rowfilter_validator(char *relname, Node *expr)
+{
+ rf_context context = {0};
+
+ context.relname = relname;
+ rowfilter_walker(expr, &context);
+}
+
+/*
* Find all the columns referenced by the row-filter expression and return what
* is found as a list of RfCol. This list is used for row-filter validation.
*/
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index ff2f28d..21ac56a 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -143,7 +143,7 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Walk the parse-tree to decide if the row-filter is valid or not.
*/
static void
-rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node *rfnode, Relation rel)
{
Oid relid = RelationGetRelid(rel);
char *relname = RelationGetRelationName(rel);
@@ -151,6 +151,14 @@ rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
/*
* Rule:
*
+ * Walk the parse-tree and reject anything more complicated than a very
+ * simple expression.
+ */
+ rowfilter_validator(relname, rfnode);
+
+ /*
+ * Rule:
+ *
* If the publish operation contains "delete" then only columns that
* are allowed by the REPLICA IDENTITY rules are permitted to be used in
* the row-filter WHERE clause.
@@ -271,13 +279,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
whereclause = transformWhereClause(pstate,
copyObject(pri->whereClause),
EXPR_KIND_PUBLICATION_WHERE,
- "PUBLICATION");
+ "PUBLICATION WHERE");
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
/* Validate the row-filter. */
- rowfilter_expr_checker(pub, whereclause, targetrel);
+ rowfilter_expr_checker(pub, pstate, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..5e0c391 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,12 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
if (isAgg)
err = _("aggregate functions are not allowed in publication WHERE expressions");
else
err = _("grouping operations are not allowed in publication WHERE expressions");
-
+#endif
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +952,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("window functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3210506..36b7e53 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -206,6 +206,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
case T_FuncCall:
{
+#if 0
/*
* Forbid functions in publication WHERE condition
*/
@@ -214,6 +215,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("functions are not allowed in publication WHERE expressions"),
parser_errposition(pstate, exprLocation(expr))));
+#endif
result = transformFuncCall(pstate, (FuncCall *) expr);
break;
@@ -1782,7 +1784,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("cannot use subquery in publication WHERE expression");
+#endif
break;
/*
@@ -3105,7 +3109,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index e946f17..de9600f 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,10 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+ pstate->p_hasTargetSRFs = true;
+#if 0
err = _("set-returning functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..b3588df 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,12 +718,14 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+#if 0
/* Check it's not a custom operator for publication WHERE expressions */
if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("user-defined operators are not allowed in publication WHERE expressions"),
parser_errposition(pstate, location)));
+#endif
/* Do typecasting and build the expression tree */
if (ltree == NULL)
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 2c7310e..dd69aff 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -16,7 +16,6 @@
#include "catalog/objectaddress.h"
-
/*
* Precise semantics of a dependency relationship are specified by the
* DependencyType code (which is stored in a "char" field in pg_depend,
@@ -156,6 +155,7 @@ typedef struct RfCol {
int attnum;
} RfCol;
extern List *rowfilter_find_cols(Node *expr, Oid relId);
+extern void rowfilter_validator(char *relname, Node *expr);
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index aa97e4d..237396e 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -209,16 +209,21 @@ Tables:
-- fail - functions disallowed
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl4"
+DETAIL: function calls are not allowed
+HINT: only simple expressions using columns and constants are allowed
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+HINT: only simple expressions using columns and constants are allowed
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: too complex
+HINT: only simple expressions using columns and constants are allowed
-- fail - WHERE not allowed in DROP
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: syntax error at or near "WHERE"
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 8552b36..faf7450 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -116,6 +116,8 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
-- fail - WHERE not allowed in DROP
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
--
1.8.3.1
Hi Euler,
FYI - the last-known-good "main" patch has been broken in the cfbot
for the last couple of days due to a recent commit [1] on HEAD.
To keep the cfbot happy I have rebased it.
In this same post (so that they will not be misplaced and so they
remain working with HEAD) I am also re-attaching all of my currently
pending "incremental" patches. These are either awaiting merge back
into the "main" patch and/or they are awaiting review.
~
PSA 5 patches:
v29-0001 = the latest "main" patch (was
v28-0001-Row-filter-for-logical-replication.patch from [2]) is now
rebased to HEAD.
v29-0002 = my tab auto-complete patch (was
v1-0001-Add-tab-auto-complete-support-for-the-Row-Filter-.patch from
[3]).
v29-0003 = my cache updates patch (was
v26-0002-ExprState-cache-modifications.patch from [4]) awaiting merge.
v29-0004 = my filter validation replica identity patch (was
v28-0002-Row-filter-validation-replica-identity.patch from [5])
awaiting review/merge.
v29-0005 = my filter validation walker POC patch (was
v28-0003-POC-row-filter-walker-validation.patch from [6]) awaiting
feedback.
~
It is getting increasingly time-consuming to maintain and track all
these separate patches. If possible, please merge them back into the
"main" patch.
------
[1]: https://github.com/postgres/postgres/commit/1882d6cca161dcf9fa05ecab5abeb1a027a5cfd2
[2]: /messages/by-id/CAHut+Pv-Gz_bA6djDOnTz0OT-fMykKwidsK6bLDU5mZ1KWX9KQ@mail.gmail.com
[3]: /messages/by-id/CAHut+Psi7EygLemHnQbdLSZhBqyxqHY-3Mov1RS5xFAR=xg-wg@mail.gmail.com
[4]: /messages/by-id/CAHut+PsgRHymwLhJ9t3By6+KNaVDzfjf6Y4Aq=JRD-y8t1mEFg@mail.gmail.com
[5]: /messages/by-id/CAHut+PukNh_HsN1Au1p9YhG5KCOr3dH5jnwm=RmeX75BOtXTEg@mail.gmail.com
[6]: /messages/by-id/CAHut+Pt6+=w7_r=CHBCS+yZXk5V+tnrzHLi3b2ZOVP1LHL2W9w@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
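For readers skimming the thread, the syntax these patches implement is the one
proposed at the top of the thread. A minimal sketch (table and column names
here are illustrative, not taken from the patches):

```sql
-- Publish only rows matching the filter (syntax from the original proposal).
CREATE PUBLICATION foo FOR TABLE departments WHERE (id > 2000 AND id <= 3000);

-- The same optional WHERE clause is accepted when adding a table to an
-- existing publication (see the regression tests in the attached patches).
ALTER PUBLICATION foo ADD TABLE employees WHERE (dept_id <= 3000);
```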
Attachments:
v29-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patchapplication/octet-stream; name=v29-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patchDownload
From a5dbaa22d0c9c5d56d7fd56b956acd0a2ace035b Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 15 Sep 2021 16:40:39 +1000
Subject: [PATCH v29] PS - Add tab auto-complete support for the Row Filter
WHERE.
The following auto-completions are added:
Complete "CREATE PUBLICATION <name> FOR TABLE <name>" with "WHERE (".
Complete "ALTER PUBLICATION <name> ADD TABLE <name>" with "WHERE (".
Complete "ALTER PUBLICATION <name> SET TABLE <name>" with "WHERE (".
---
src/bin/psql/tab-complete.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5cd5838..8686ec6 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1648,6 +1648,11 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET", "TABLE", MatchAny)
+ || Matches("ALTER", "PUBLICATION", MatchAny, "ADD", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (");
/* ALTER PUBLICATION <name> SET ( */
else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("publish", "publish_via_partition_root");
@@ -2693,9 +2698,10 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("TABLE", "ALL TABLES");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL"))
COMPLETE_WITH("TABLES");
- else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES")
- || Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("WITH (");
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
--
1.8.3.1
v29-0005-PS-POC-Row-filter-validation-walker.patchapplication/octet-stream; name=v29-0005-PS-POC-Row-filter-validation-walker.patchDownload
From 5565d0a1b00acf75e24beae708bb20e20a7bd6f2 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 15 Sep 2021 18:29:52 +1000
Subject: [PATCH v29] PS - POC Row filter validation walker
This patch implements a parse-tree "walker" to validate a row-filter expression.
Only very simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no functions.
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr
This POC patch also disables (#if 0) all other row-filter errors which were thrown for
EXPR_KIND_PUBLICATION_WHERE in the 0001 patch.
Some regression tests are updated due to the modified validation error messages.
---
src/backend/catalog/dependency.c | 68 +++++++++++++++++++++++++++++++
src/backend/catalog/pg_publication.c | 14 +++++--
src/backend/parser/parse_agg.c | 5 ++-
src/backend/parser/parse_expr.c | 6 ++-
src/backend/parser/parse_func.c | 3 ++
src/backend/parser/parse_oper.c | 2 +
src/include/catalog/dependency.h | 2 +-
src/test/regress/expected/publication.out | 17 +++++---
src/test/regress/sql/publication.sql | 2 +
9 files changed, 107 insertions(+), 12 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index e81f093..405b3cd 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -132,6 +132,12 @@ typedef struct
int subflags; /* flags to pass down when recursing to obj */
} ObjectAddressAndFlags;
+/* for rowfilter_walker */
+typedef struct
+{
+ char *relname;
+} rf_context;
+
/* for find_expr_references_walker */
typedef struct
{
@@ -1554,6 +1560,68 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Walker checks that the row filter expression is legal. Allow only simple
+ * or compound expressions like:
+ *
+ * "(Var Op Const)" or
+ * "(Var Op Const) Bool (Var Op Const)"
+ *
+ * Nothing more complicated is permitted. Specifically, no functions of any kind
+ * and no user-defined operators.
+ */
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var) || IsA(node, Const) || IsA(node, BoolExpr))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ forbidden = _("function calls are not allowed");
+ }
+ else
+ {
+ elog(DEBUG1, "row filter contained something unexpected: %s", nodeToString(node));
+ forbidden = _("too complex");
+ }
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ context->relname),
+ errdetail("%s", forbidden),
+ errhint("only simple expressions using columns and constants are allowed")
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
+/*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if it encounters anything not permitted or unexpected.
+ */
+void
+rowfilter_validator(char *relname, Node *expr)
+{
+ rf_context context = {0};
+
+ context.relname = relname;
+ rowfilter_walker(expr, &context);
+}
+
+/*
* Find all the columns referenced by the row-filter expression and return what
* is found as a list of RfCol. This list is used for row-filter validation.
*/
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index ff2f28d..21ac56a 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -143,7 +143,7 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Walk the parse-tree to decide if the row-filter is valid or not.
*/
static void
-rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node *rfnode, Relation rel)
{
Oid relid = RelationGetRelid(rel);
char *relname = RelationGetRelationName(rel);
@@ -151,6 +151,14 @@ rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
/*
* Rule:
*
+ * Walk the parse-tree and reject anything more complicated than a very
+ * simple expression.
+ */
+ rowfilter_validator(relname, rfnode);
+
+ /*
+ * Rule:
+ *
* If the publish operation contains "delete" then only columns that
* are allowed by the REPLICA IDENTITY rules are permitted to be used in
* the row-filter WHERE clause.
@@ -271,13 +279,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
whereclause = transformWhereClause(pstate,
copyObject(pri->whereClause),
EXPR_KIND_PUBLICATION_WHERE,
- "PUBLICATION");
+ "PUBLICATION WHERE");
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
/* Validate the row-filter. */
- rowfilter_expr_checker(pub, whereclause, targetrel);
+ rowfilter_expr_checker(pub, pstate, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..5e0c391 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,12 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
if (isAgg)
err = _("aggregate functions are not allowed in publication WHERE expressions");
else
err = _("grouping operations are not allowed in publication WHERE expressions");
-
+#endif
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +952,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("window functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..3519e62 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -201,6 +201,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
case T_FuncCall:
{
+#if 0
/*
* Forbid functions in publication WHERE condition
*/
@@ -209,6 +210,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("functions are not allowed in publication WHERE expressions"),
parser_errposition(pstate, exprLocation(expr))));
+#endif
result = transformFuncCall(pstate, (FuncCall *) expr);
break;
@@ -1777,7 +1779,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("cannot use subquery in publication WHERE expression");
+#endif
break;
/*
@@ -3100,7 +3104,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index e946f17..de9600f 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,10 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+ pstate->p_hasTargetSRFs = true;
+#if 0
err = _("set-returning functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..b3588df 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,12 +718,14 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+#if 0
/* Check it's not a custom operator for publication WHERE expressions */
if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("user-defined operators are not allowed in publication WHERE expressions"),
parser_errposition(pstate, location)));
+#endif
/* Do typecasting and build the expression tree */
if (ltree == NULL)
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 2c7310e..dd69aff 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -16,7 +16,6 @@
#include "catalog/objectaddress.h"
-
/*
* Precise semantics of a dependency relationship are specified by the
* DependencyType code (which is stored in a "char" field in pg_depend,
@@ -156,6 +155,7 @@ typedef struct RfCol {
int attnum;
} RfCol;
extern List *rowfilter_find_cols(Node *expr, Oid relId);
+extern void rowfilter_validator(char *relname, Node *expr);
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index aa97e4d..237396e 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -209,16 +209,21 @@ Tables:
-- fail - functions disallowed
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl4"
+DETAIL: function calls are not allowed
+HINT: only simple expressions using columns and constants are allowed
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+HINT: only simple expressions using columns and constants are allowed
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: too complex
+HINT: only simple expressions using columns and constants are allowed
-- fail - WHERE not allowed in DROP
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: syntax error at or near "WHERE"
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 8552b36..faf7450 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -116,6 +116,8 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
-- fail - WHERE not allowed in DROP
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
--
1.8.3.1
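To illustrate the POC walker's behavior, here are the expression shapes it
accepts and rejects (drawn from the patch comments and the regression test
changes above; publication and table names are illustrative):

```sql
-- allowed: "(Var Op Const)" terms, optionally combined with AND/OR
CREATE PUBLICATION p_ok FOR TABLE t WHERE (a > 5 AND b <= 10);

-- rejected with DETAIL: function calls are not allowed
CREATE PUBLICATION p_fn FOR TABLE t WHERE (length(c) < 6);

-- rejected with DETAIL: too complex (a sublink is none of Var/OpExpr/Const/BoolExpr)
CREATE PUBLICATION p_sub FOR TABLE t WHERE (a IN (SELECT generate_series(1,5)));
```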
v29-0003-PS-ExprState-cache-modifications.patchapplication/octet-stream; name=v29-0003-PS-ExprState-cache-modifications.patchDownload
From afc4e60c427ee4f6d7edd0810f9a9c05d49bd6f0 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 15 Sep 2021 17:43:55 +1000
Subject: [PATCH v29] PS - ExprState cache modifications.
The cached row-filter state (e.g. the ExprState list) is now invalidated only
in the rel_sync_cache_relation_cb function, so ALTER PUBLICATION for one table
should not cause the row filters of other tables to also become invalidated.
Also, all code related to caching row filters has been removed from the
get_rel_sync_entry function; caching is now done just before the filters are
needed, in the pgoutput_row_filter function.
Changes are based on suggestions from Amit [1] [2].
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
---
src/backend/replication/pgoutput/pgoutput.c | 200 +++++++++++++++++++---------
1 file changed, 136 insertions(+), 64 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 1220203..ce5e1c5 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -123,7 +123,15 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
- List *exprstate; /* ExprState for row filter */
+
+ /*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' only means that exprstate_list is correct -
+ * It doesn't mean that there actual is any row filter present for the
+ * current relid.
+ */
+ bool rowfilter_valid;
+ List *exprstate_list; /* ExprState for row filter(s) */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
@@ -161,7 +169,7 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static EState *create_estate_for_relation(Relation rel);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
/*
@@ -731,20 +739,121 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
* If it returns true, the change is replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
{
EState *estate;
ExprContext *ecxt;
ListCell *lc;
bool result = true;
+ Oid relid = RelationGetRelid(relation);
+
+ /*
+ * If the row filter caching is currently flagged "invalid" then it means we
+ * don't know yet whether there are any row filters for this relation.
+ *
+ * This code is usually one-time execution.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ bool am_partition = get_rel_relispartition(relid);
+ MemoryContext oldctx;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains. Release the tuple table slot if it
+ * already exists.
+ */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState(s) and cache them in the
+ * entry->exprstate_list.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+ Oid pub_relid = relid;
+
+ if (pub->pubviaroot && am_partition)
+ {
+ if (pub->alltables)
+ pub_relid = llast_oid(get_partition_ancestors(relid));
+ else
+ {
+ List *ancestors = get_partition_ancestors(relid);
+ ListCell *lc2;
+
+ /*
+ * Find the "topmost" ancestor that is in this
+ * publication.
+ */
+ foreach(lc2, ancestors)
+ {
+ Oid ancestor = lfirst_oid(lc2);
+
+ if (list_member_oid(GetRelationPublications(ancestor),
+ pub->oid))
+ {
+ pub_relid = ancestor;
+ }
+ }
+ }
+ }
+
+ /*
+ * Look up whether there is a row filter, and if so build the ExprState for it.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(pub_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate_list = lappend(entry->exprstate_list, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+
+ } /* loop all subscribed publications */
+
+ entry->rowfilter_valid = true;
+ }
/* Bail out if there is no row filter */
- if (entry->exprstate == NIL)
+ if (entry->exprstate_list == NIL)
return true;
elog(DEBUG3, "table \"%s.%s\" has row filter",
- get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
- get_rel_name(relation->rd_id));
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
PushActiveSnapshot(GetTransactionSnapshot());
@@ -761,7 +870,7 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
* different row filter in these publications, all row filters must be
* matched in order to replicate this change.
*/
- foreach(lc, entry->exprstate)
+ foreach(lc, entry->exprstate_list)
{
ExprState *exprstate = (ExprState *) lfirst(lc);
@@ -840,7 +949,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
break;
/*
@@ -873,7 +982,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -907,7 +1016,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1318,10 +1427,11 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
- entry->exprstate = NIL;
+ entry->exprstate_list = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1333,7 +1443,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
- TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1347,22 +1456,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
publications_valid = true;
}
- /* Release tuple table slot */
- if (entry->scantuple != NULL)
- {
- ExecDropSingleTupleTableSlot(entry->scantuple);
- entry->scantuple = NULL;
- }
-
- /*
- * Create a tuple table slot for row filter. TupleDesc must live as
- * long as the cache remains.
- */
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- tupdesc = CreateTupleDescCopy(tupdesc);
- entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
- MemoryContextSwitchTo(oldctx);
-
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1372,9 +1465,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
{
Publication *pub = lfirst(lc);
bool publish = false;
- HeapTuple rftuple;
- Datum rfdatum;
- bool rfisnull;
if (pub->alltables)
{
@@ -1434,33 +1524,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- /*
- * Cache row filter, if available. All publication-table mappings
- * must be checked. If it is a partition and pubviaroot is true,
- * use the row filter of the topmost partitioned table instead of
- * the row filter of its own partition.
- */
- rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
- if (HeapTupleIsValid(rftuple))
- {
- rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
-
- if (!rfisnull)
- {
- Node *rfnode;
- ExprState *exprstate;
-
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = stringToNode(TextDatumGetCString(rfdatum));
-
- /* Prepare for expression execution */
- exprstate = pgoutput_row_filter_init_expr(rfnode);
- entry->exprstate = lappend(entry->exprstate, exprstate);
- MemoryContextSwitchTo(oldctx);
- }
-
- ReleaseSysCache(rftuple);
- }
}
list_free(pubids);
@@ -1567,6 +1630,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanups. (Will be rebuilt later if needed).
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstate_list != NIL)
+ {
+ list_free_deep(entry->exprstate_list);
+ entry->exprstate_list = NIL;
+ }
}
}
@@ -1607,12 +1685,6 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
-
- if (entry->exprstate != NIL)
- {
- list_free_deep(entry->exprstate);
- entry->exprstate = NIL;
- }
}
MemoryContextSwitchTo(oldctx);
--
1.8.3.1
v29-0004-PS-Row-filter-validation-of-replica-identity.patchapplication/octet-stream; name=v29-0004-PS-Row-filter-validation-of-replica-identity.patchDownload
From 788ce062547161ec3ce0fa5082a834dce6e85d52 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 15 Sep 2021 18:06:54 +1000
Subject: [PATCH v29] PS - Row filter validation of replica identity
This patch introduces some additional row filter validation. Currently it is
implemented only for the publish mode "delete": it validates that any columns
referenced in the filter expression are part of the REPLICA IDENTITY or the
primary key.
Also updated test code.
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
---
src/backend/catalog/dependency.c | 58 ++++++++++++++++++
src/backend/catalog/pg_publication.c | 74 +++++++++++++++++++++++
src/include/catalog/dependency.h | 6 ++
src/test/regress/expected/publication.out | 97 +++++++++++++++++++++++++++++--
src/test/regress/sql/publication.sql | 78 ++++++++++++++++++++++++-
src/test/subscription/t/025_row_filter.pl | 7 +--
6 files changed, 310 insertions(+), 10 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index 91c3e97..e81f093 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -1554,6 +1554,64 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Find all the columns referenced by the row-filter expression and return what
+ * is found as a list of RfCol. This list is used for row-filter validation.
+ */
+List *
+rowfilter_find_cols(Node *expr, Oid relId)
+{
+ find_expr_references_context context;
+ RangeTblEntry rte;
+ int ref;
+ List *rfcol_list = NIL;
+
+ context.addrs = new_object_addresses();
+
+ /* We gin up a rather bogus rangetable list to handle Vars */
+ MemSet(&rte, 0, sizeof(rte));
+ rte.type = T_RangeTblEntry;
+ rte.rtekind = RTE_RELATION;
+ rte.relid = relId;
+ rte.relkind = RELKIND_RELATION; /* no need for exactness here */
+ rte.rellockmode = AccessShareLock;
+
+ context.rtables = list_make1(list_make1(&rte));
+
+ /* Scan the expression tree for referenceable objects */
+ find_expr_references_walker(expr, &context);
+
+ /* Remove any duplicates */
+ eliminate_duplicate_dependencies(context.addrs);
+
+ /* Build/Return the list of columns referenced by this Row Filter */
+ for (ref = 0; ref < context.addrs->numrefs; ref++)
+ {
+ ObjectAddress *thisobj = context.addrs->refs + ref;
+
+ if (thisobj->classId == RelationRelationId)
+ {
+ RfCol *rfcol;
+
+ /*
+ * The parser already took care of ensuring columns must be from
+ * the correct table.
+ */
+ Assert(thisobj->objectId == relId);
+
+ rfcol = palloc(sizeof(RfCol));
+ rfcol->name = get_attname(thisobj->objectId, thisobj->objectSubId, false);
+ rfcol->attnum = thisobj->objectSubId;
+
+ rfcol_list = lappend(rfcol_list, rfcol);
+ }
+ }
+
+ free_object_addresses(context.addrs);
+
+ return rfcol_list;
+}
+
+/*
* recordDependencyOnExpr - find expression dependencies
*
* This is used to find the dependencies of rules, constraint expressions,
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index a1ea0f8..ff2f28d 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -139,6 +139,77 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
+/*
+ * Walk the parse-tree to decide if the row-filter is valid or not.
+ */
+static void
+rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+{
+ Oid relid = RelationGetRelid(rel);
+ char *relname = RelationGetRelationName(rel);
+
+ /*
+ * Rule:
+ *
+ * If the publish operation contains "delete" then only columns that
+ * are allowed by the REPLICA IDENTITY rules are permitted to be used in
+ * the row-filter WHERE clause.
+ *
+ * TODO - check later for publish "update" case.
+ */
+ if (pub->pubactions.pubdelete)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ List *rfcols;
+ ListCell *lc;
+ Bitmapset *bms_okcols;
+
+ /*
+ * Find which columns are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+
+ /*
+ * Find which columns are referenced in the row-filter WHERE clause,
+ * and validate that each referenced column is allowed.
+ */
+ rfcols = rowfilter_find_cols(rfnode, relid);
+ foreach(lc, rfcols)
+ {
+ RfCol *rfcol = lfirst(lc);
+ char *colname = rfcol->name;
+ int attnum = rfcol->attnum;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, bms_okcols))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ relname),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+ }
+
+ bms_free(bms_okcols);
+ list_free_deep(rfcols);
+ }
+ }
+}
/*
* Insert new publication / relation mapping.
@@ -204,6 +275,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 2885f35..2c7310e 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -151,6 +151,12 @@ extern void performDeletion(const ObjectAddress *object,
extern void performMultipleDeletions(const ObjectAddresses *objects,
DropBehavior behavior, int flags);
+typedef struct RfCol
+{
+	char	   *name;
+	int			attnum;
+} RfCol;
+extern List *rowfilter_find_cols(Node *expr, Oid relId);
+
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
DependencyType behavior);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 156f14c..aa97e4d 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -163,13 +163,15 @@ CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- First, test using the option publish="insert", because row-filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -179,7 +181,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -190,7 +192,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -201,7 +203,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
@@ -229,6 +231,91 @@ DROP TABLE testpub_rf_tbl4;
DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 331b821..8552b36 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -98,7 +98,9 @@ CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- First, test using the option publish="insert", because row-filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -125,6 +127,80 @@ DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index 6428f0d..dc9becc 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -223,9 +225,7 @@ $node_publisher->wait_for_catchup($appname);
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -234,7 +234,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
--
1.8.3.1
v29-0001-Row-filter-for-logical-replication.patch (application/octet-stream) Download
From 942c0168bed60846933fbab735528287686c6fa1 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 15 Sep 2021 16:32:39 +1000
Subject: [PATCH v29] Row filter for logical replication
This feature adds a row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated, because DELETE uses the old row version (which is limited
to the primary key or REPLICA IDENTITY columns) to evaluate the row filter.
INSERT and UPDATE use the new row version, so any column can be used. If the
row filter evaluates to NULL, it is treated as false. For simplicity,
functions are not allowed; this could be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If the subscriber is a pre-15 version, data
synchronization won't use row filters even if they are defined on the
publisher, since previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines whether the partition's row filter (if
the parameter is false, the default) or the root partitioned table's row
filter is used.
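
As a minimal end-to-end sketch of the behavior described above (table,
publication, and connection names are illustrative, and both nodes are assumed
to run with this patch applied):

```sql
-- Publisher: publish only rows with id > 1000.  The filter column is the
-- primary key, so DELETEs can also be evaluated against the old row.
CREATE TABLE orders (id int PRIMARY KEY, note text);
CREATE PUBLICATION pub_recent FOR TABLE orders WHERE (id > 1000);

-- Subscriber: initial synchronization copies only rows satisfying the filter.
CREATE TABLE orders (id int PRIMARY KEY, note text);
CREATE SUBSCRIPTION sub_recent
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION pub_recent;
```

After this, an INSERT of (500, ...) on the publisher is filtered out while
(1500, ...) is replicated, and a DELETE of id = 1500 is replicated because the
old row (the primary key) satisfies the filter.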
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 53 ++++-
src/backend/commands/publicationcmds.c | 104 ++++++----
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 5 +-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 257 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 4 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 5 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 71 +++++++
src/test/regress/sql/publication.sql | 33 +++
src/test/subscription/t/025_row_filter.pl | 300 ++++++++++++++++++++++++++++
25 files changed, 1000 insertions(+), 79 deletions(-)
create mode 100644 src/test/subscription/t/025_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2f0def9..1fca628 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6231,6 +6231,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b..4bb4314 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbc..8f78fbb 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -183,6 +187,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will
+ not be replicated. That is because the old row is used, and it contains
+ only the primary key or <literal>REPLICA IDENTITY</literal> columns; the
+ remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ and <command>UPDATE</command> operations, any column may be used in the
+ <literal>WHERE</literal> clause, because the new row is used and it
+ contains all columns. A <literal>NULL</literal> value causes the
+ expression to evaluate to false, so avoid using columns without not-null
+ constraints in the <literal>WHERE</literal> clause. The
+ <literal>WHERE</literal> clause does not allow functions or user-defined
+ operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -197,6 +216,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -210,6 +234,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..5a9430e 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -206,7 +206,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether to copy pre-existing data in the publications
that are being subscribed to when the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
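
The "rows must satisfy all expressions" rule above can be seen with two
publications of the same table (a sketch; table and connection names are
illustrative):

```sql
-- Publisher: the same table published twice with different filters.
CREATE PUBLICATION pub_low  FOR TABLE t WHERE (x < 100);
CREATE PUBLICATION pub_high FOR TABLE t WHERE (x > 10);

-- Subscriber: the initial copy includes only rows with x > 10 AND x < 100.
CREATE SUBSCRIPTION sub_both
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION pub_low, pub_high;
```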
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index d6fddd6..a1ea0f8 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,21 +144,27 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -172,10 +181,30 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -189,6 +218,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -205,11 +240,19 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel->relation);
+ CacheInvalidateRelcache(targetrel);
return myself;
}
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 30929da..f10539c 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -389,38 +389,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- PublicationRelInfo *newpubrel;
-
- newpubrel = (PublicationRelInfo *) lfirst(newlc);
- if (RelationGetRelid(newpubrel->relation) == oldrelid)
- {
- found = true;
- break;
- }
- }
- /* Not yet in the list, open it and add to the list */
- if (!found)
- {
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
- }
+ PublicationRelInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -539,9 +525,10 @@ RemovePublicationById(Oid pubid)
}
/*
- * Open relations specified by a PublicationTable list.
- * In the returned list of PublicationRelInfo, tables are locked
- * in ShareUpdateExclusiveLock mode in order to add them to a publication.
+ * Open relations specified by a RangeVar list (PublicationTable or Relation).
+ *
+ * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
+ * add them to a publication.
*/
static List *
OpenTableList(List *tables)
@@ -549,22 +536,46 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelInfo *pub_rel;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- PublicationTable *t = lfirst_node(PublicationTable, lc);
- bool recurse = t->relation->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
- PublicationRelInfo *pub_rel;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a list of PublicationTable
+ * nodes (relation and WHERE clause), whereas ALTER PUBLICATION ... DROP
+ * TABLE provides a list of RangeVar nodes. Inspect the list element to
+ * determine which form we were given.
+ */
+ if (IsA(lfirst(lc), PublicationTable))
+ whereclause = true;
+ else
+ whereclause = false;
+
+ if (whereclause)
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+ else
+ {
+ rv = lfirst_node(RangeVar, lc);
+ }
+
+ recurse = rv->inh;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
- rel = table_openrv(t->relation, ShareUpdateExclusiveLock);
+ rel = table_openrv(rv, ShareUpdateExclusiveLock);
myrelid = RelationGetRelid(rel);
/*
@@ -581,7 +592,12 @@ OpenTableList(List *tables)
}
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = myrelid;
pub_rel->relation = rel;
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
@@ -616,7 +632,13 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = childrelid;
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -643,6 +665,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -663,7 +687,7 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(pub_rel->relid, GetUserId()))
aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
RelationGetRelationName(rel));
@@ -692,11 +716,9 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
PublicationRelInfo *pubrel = (PublicationRelInfo *) lfirst(lc);
- Relation rel = pubrel->relation;
- Oid relid = RelationGetRelid(rel);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pubrel->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -706,7 +728,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pubrel->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 228387e..a69e131 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4964,6 +4964,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 800f588..7a33695 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -3137,6 +3137,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e3068a3..9765aeb 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9637,10 +9637,11 @@ publication_table_list:
{ $$ = lappend($1, $3); }
;
-publication_table: relation_expr
+publication_table: relation_expr OptWhereClause
{
PublicationTable *n = makeNode(PublicationTable);
n->relation = $1;
+ n->whereClause = $2;
$$ = (Node *) n;
}
;
@@ -9681,7 +9682,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE publication_table_list
+ | ALTER PUBLICATION name DROP TABLE relation_expr_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..e946f17 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
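The operator restriction above hinges on a single OID threshold: in PostgreSQL, OIDs below FirstNormalObjectId (16384) are reserved for built-in objects. A minimal Python sketch of the check, not the patch's C code (OID 97, int4lt, is just an illustrative built-in operator):

```python
FIRST_NORMAL_OBJECT_ID = 16384  # PostgreSQL's FirstNormalObjectId

def is_user_defined(oper_oid: int) -> bool:
    """Mirror the patch's test: opform->oid >= FirstNormalObjectId."""
    return oper_oid >= FIRST_NORMAL_OBJECT_ID

def check_operator_for_publication_where(oper_oid: int) -> None:
    # Reject user-defined operators: their behavior can change (or they
    # can be dropped) after the row filter expression has been stored.
    if is_user_defined(oper_oid):
        raise ValueError("user-defined operators are not allowed in "
                         "publication WHERE expressions")

check_operator_for_publication_where(97)  # built-in operator: accepted
```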
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..9d86a10 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get the relation qual. DISTINCT prevents the same expression from
+ * being included more than once in the final expression when the table
+ * belongs to multiple publications.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we must not copy generated columns. For
+ * tables with row filters, build a SELECT query with the filters
+ * AND'ed together for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
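The copy_table() changes above boil down to a small piece of string assembly: plain COPY when no filter applies, COPY (SELECT ...) with AND'ed filters otherwise. A hedged Python sketch of that logic (identifier quoting is elided; the table and column names are invented):

```python
def build_copy_command(nspname, relname, attnames, quals, plain_table=True):
    # quote_qualified_identifier() is omitted for brevity.
    qualified = f"{nspname}.{relname}"
    if plain_table and not quals:
        return f"COPY {qualified} TO STDOUT"
    cmd = f"COPY (SELECT {', '.join(attnames)} FROM {qualified}"
    if quals:
        # Row filters from different publications are AND'ed together.
        cmd += " WHERE " + " AND ".join(quals)
    return cmd + ") TO STDOUT"
```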
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 14d737f..1220203 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +148,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +157,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +639,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache ExprState using CacheMemoryContext. This is the same code as
+ * ExecPrepareExpr() but that is not used because it doesn't use an EState.
+ * It should probably be another function in the executor to handle the
+ * execution outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the
+ * change is not replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the table has a
+ * different row filter in each of them, all row filters must match for
+ * the change to be replicated.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
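The evaluation semantics of pgoutput_row_filter() and pgoutput_row_filter_exec_expr() can be summarized in a few lines of Python (a sketch only; None stands in for SQL NULL):

```python
def filter_passes(result):
    # SQL three-valued logic collapsed for replication: NULL (None)
    # counts as false, so the change is not replicated.
    return False if result is None else bool(result)

def row_filter(filter_results):
    # All filters (one per publication) must match; an empty list means
    # the table has no row filter, so the change is always replicated.
    return all(filter_passes(r) for r in filter_results)
```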
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +808,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +832,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +839,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * The schema must be sent before switching to the ancestor
+ * relation, because maybe_send_schema() also sends the
+ * ancestor's schema.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +872,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +906,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +975,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1113,9 +1294,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1138,6 +1320,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1149,6 +1333,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1162,6 +1347,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for the row filter. The TupleDesc must
+ * live as long as the cache entry does.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1171,6 +1372,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1230,9 +1434,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1350,6 +1578,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1359,6 +1588,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1376,7 +1607,15 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ {
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
+ }
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index a485fb2..0f4892c 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4141,6 +4141,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4151,9 +4152,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4162,6 +4170,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4202,6 +4211,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4234,8 +4247,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
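The pg_dump change above only appends a WHERE clause when a filter was stored. A minimal Python sketch of the emitted command (the publication and table names are illustrative):

```python
def dump_publication_table(pubname, table, qual=None):
    # Append WHERE (...) only when a row filter was stored for this
    # publication-table mapping (prqual is non-NULL).
    cmd = f"ALTER PUBLICATION {pubname} ADD TABLE ONLY {table}"
    if qual is not None:
        cmd += f" WHERE ({qual})"
    return cmd + ";\n"
```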
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 29af845..f932a70 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -629,6 +629,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 90ff649..5f6418a 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 561266a..f748434 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,7 +85,9 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
+ Oid relid;
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -113,7 +115,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 45e4f2a..765c656 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3640,6 +3640,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
typedef struct CreatePublicationStmt
@@ -3647,7 +3648,7 @@ typedef struct CreatePublicationStmt
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3660,7 +3661,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index cad1b37..156f14c 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -158,6 +158,77 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 04b34ee..331b821 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,39 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
new file mode 100644
index 0000000..6428f0d
--- /dev/null
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -0,0 +1,300 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions must succeed. In this case, rows are replicated only if c is
+# divisible by both 2 AND 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
On Wed, Sep 8, 2021 at 7:59 PM Ajin Cherian <itsajin@gmail.com> wrote:
On Wed, Sep 1, 2021 at 9:23 PM Euler Taveira <euler@eulerto.com> wrote:
Somehow this approach of either new_tuple or old_tuple doesn't seem to
be very fruitful if the user requires that his replica is up-to-date
based on the filter condition. For that, I think you will need to
convert UPDATEs to either INSERTs or DELETEs if only the new_tuple or
the old_tuple matches the filter condition, but not both:
old-row (match) new-row (no match) -> DELETE
old-row (no match) new-row (match) -> INSERT
old-row (match) new-row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
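The four-way mapping above can be sketched as a small decision function (a hypothetical illustration only; the actual implementation works on tuples inside the pgoutput plugin in C):

```python
def transform_update(old_matches: bool, new_matches: bool):
    """Decide what to publish for an UPDATE, given whether the old and
    new tuples match the publication's row filter (per the table above)."""
    if old_matches and not new_matches:
        return "DELETE"   # row leaves the filtered set on the subscriber
    if not old_matches and new_matches:
        return "INSERT"   # row enters the filtered set on the subscriber
    if old_matches and new_matches:
        return "UPDATE"   # ordinary update within the filtered set
    return None           # neither tuple matches: drop the change
```

This keeps the subscriber's copy consistent with the filter: a row that moves out of the filtered range is removed downstream instead of silently going stale.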
Adding a patch that strives to do the logic that I described above.
For updates, the row filter is applied on both the old_tuple and the
new_tuple. This patch assumes that the row filter only uses columns
that are part of the REPLICA IDENTITY. (The current patch set only
restricts this for row filters that are delete-only.)
The old_tuple contains only those REPLICA IDENTITY columns that have
been changed, which is a problem when applying the row filter. Since
unchanged REPLICA IDENTITY columns are not present in the old_tuple,
this patch creates a temporary old_tuple by taking such column values
from the new_tuple, and then applies the filter to this hand-created
temp old_tuple. The way the old_tuple is created can be better
optimized in future versions.
This patch also handles the problem reported by Hou in [1]. The patch
assumes a fix proposed by Dilip in [2]. This is the case where
toasted, unchanged REPLICA IDENTITY columns are not detoasted in the
new_tuple and have to be retrieved from disk during decoding. Dilip's
fix involves updating the detoasted value in the old_tuple when
writing to WAL. In the problem reported by Hou, when the row filter is
applied on the new_tuple and the decoder attempts to detoast the value
in the new_tuple, decoding fails if the table has been dropped by that
time. To avoid this, in such a situation the detoasted value in the
old_tuple (fix by Dilip) is copied to the new_tuple before the row
filter is applied.
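The temporary old tuple construction described above might look like this in pseudo-Python (a hedged sketch of the approach; the real code operates on HeapTuples in C, and the column names here are invented):

```python
def make_temp_old_tuple(old_tuple: dict, new_tuple: dict) -> dict:
    """Build a complete old tuple for row-filter evaluation.

    old_tuple carries only the replica-identity columns that changed;
    any column missing from it is unchanged, so its value is taken
    from new_tuple. Sketch only, not the patch's actual data structures.
    """
    return {col: old_tuple.get(col, new_tuple[col]) for col in new_tuple}
```

For example, if only column "a" changed, the filter still sees a value for "b", taken from the new tuple, rather than NULL.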
I have also refactored the way Peter initializes the row_filter by
moving it into a separate function before the insert/update/delete
specific logic is applied.
I have not changed any of the first 5 patches, just added my patch 006
at the end. Do let me know of any comments on this approach.
[1]: /messages/by-id/OS0PR01MB571618736E7E79309A723BBE94E99@OS0PR01MB5716.jpnprd01.prod.outlook.com
[2]: /messages/by-id/OS0PR01MB611342D0A92D4F4BF26C0F47FB229@OS0PR01MB6113.jpnprd01.prod.outlook.com
regards,
Ajin Cherian
Fujitsu Australia
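[Editorial sketch, not part of the original mail: the attached TAP test relies on how filters from multiple publications combine during the initial copy — a row is copied only if it satisfies the filter of every publication that publishes its table. Modelling filters as predicates reproduces the tab_rowfilter_2 case, where (c % 2 = 0) AND (c % 3 = 0) over 1..20 leaves 6, 12, 18:]

```python
def row_passes(row: dict, filters: list) -> bool:
    """True only if the row satisfies every publication's row filter
    (a publication without a filter would contribute no predicate)."""
    return all(f(row) for f in filters)

# Mirrors the tab_rowfilter_2 test: tap_pub_1 uses (c % 2 = 0),
# tap_pub_2 uses (c % 3 = 0); both must hold for a row to be copied.
matched = [c for c in range(1, 21)
           if row_passes({"c": c}, [lambda r: r["c"] % 2 == 0,
                                    lambda r: r["c"] % 3 == 0])]
```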
Attachments:
v29-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v29-0001-Row-filter-for-logical-replication.patchDownload
From 942c0168bed60846933fbab735528287686c6fa1 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 15 Sep 2021 16:32:39 +1000
Subject: [PATCH v29] Row filter for logical replication
This feature adds a row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If the subscriber is a pre-15 version, data
synchronization won't use row filters even if they are defined on the
publisher, since previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 53 ++++-
src/backend/commands/publicationcmds.c | 104 ++++++----
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 5 +-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 257 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 4 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 5 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 71 +++++++
src/test/regress/sql/publication.sql | 33 +++
src/test/subscription/t/025_row_filter.pl | 300 ++++++++++++++++++++++++++++
25 files changed, 1000 insertions(+), 79 deletions(-)
create mode 100644 src/test/subscription/t/025_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2f0def9..1fca628 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6231,6 +6231,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b..4bb4314 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbc..8f78fbb 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -183,6 +187,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or that are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will not
+ be replicated. That's because the old row is used and it only contains the
+ primary key or the columns that are part of the <literal>REPLICA
+ IDENTITY</literal>; the remaining columns are <literal>NULL</literal>. For
+ <command>INSERT</command> and <command>UPDATE</command> operations, any
+ column might be used in the <literal>WHERE</literal> clause. The new row is
+ used and it contains all
+ columns. A <literal>NULL</literal> value causes the expression to evaluate
+ to false; avoid using columns without not-null constraints in the
+ <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
+ not allow functions or user-defined operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -197,6 +216,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -210,6 +234,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..5a9430e 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -206,7 +206,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether to copy pre-existing data in the publications
that are being subscribed to when the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15, any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index d6fddd6..a1ea0f8 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,6 +33,9 @@
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -141,21 +144,27 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -172,10 +181,30 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -189,6 +218,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -205,11 +240,19 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
/* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcache(targetrel->relation);
+ CacheInvalidateRelcache(targetrel);
return myself;
}
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 30929da..f10539c 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -389,38 +389,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could instead remove only
+ * (i) tables that are not found in the new table list and (ii) tables
+ * that are being re-added with a different qual expression. For (ii),
+ * however, simply updating the existing tuple is not enough, because of
+ * qual expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- PublicationRelInfo *newpubrel;
-
- newpubrel = (PublicationRelInfo *) lfirst(newlc);
- if (RelationGetRelid(newpubrel->relation) == oldrelid)
- {
- found = true;
- break;
- }
- }
- /* Not yet in the list, open it and add to the list */
- if (!found)
- {
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
- }
+ PublicationRelInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -539,9 +525,10 @@ RemovePublicationById(Oid pubid)
}
/*
- * Open relations specified by a PublicationTable list.
- * In the returned list of PublicationRelInfo, tables are locked
- * in ShareUpdateExclusiveLock mode in order to add them to a publication.
+ * Open relations specified by a list of PublicationTable or RangeVar nodes.
+ *
+ * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
+ * add them to a publication.
*/
static List *
OpenTableList(List *tables)
@@ -549,22 +536,46 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelInfo *pub_rel;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- PublicationTable *t = lfirst_node(PublicationTable, lc);
- bool recurse = t->relation->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
- PublicationRelInfo *pub_rel;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a list of PublicationTable
+ * nodes (relation plus optional WHERE clause), while ALTER PUBLICATION
+ * ... DROP TABLE provides a list of RangeVar nodes. Check the node
+ * type of each list element to see which one we got.
+ */
+ if (IsA(lfirst(lc), PublicationTable))
+ whereclause = true;
+ else
+ whereclause = false;
+
+ if (whereclause)
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+ else
+ {
+ rv = lfirst_node(RangeVar, lc);
+ }
+
+ recurse = rv->inh;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
- rel = table_openrv(t->relation, ShareUpdateExclusiveLock);
+ rel = table_openrv(rv, ShareUpdateExclusiveLock);
myrelid = RelationGetRelid(rel);
/*
@@ -581,7 +592,12 @@ OpenTableList(List *tables)
}
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = myrelid;
pub_rel->relation = rel;
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
@@ -616,7 +632,13 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = childrelid;
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -643,6 +665,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -663,7 +687,7 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(pub_rel->relid, GetUserId()))
aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
RelationGetRelationName(rel));
@@ -692,11 +716,9 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
PublicationRelInfo *pubrel = (PublicationRelInfo *) lfirst(lc);
- Relation rel = pubrel->relation;
- Oid relid = RelationGetRelid(rel);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pubrel->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -706,7 +728,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pubrel->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 228387e..a69e131 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4964,6 +4964,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 800f588..7a33695 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -3137,6 +3137,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e3068a3..9765aeb 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9637,10 +9637,11 @@ publication_table_list:
{ $$ = lappend($1, $3); }
;
-publication_table: relation_expr
+publication_table: relation_expr OptWhereClause
{
PublicationTable *n = makeNode(PublicationTable);
n->relation = $1;
+ n->whereClause = $2;
$$ = (Node *) n;
}
;
@@ -9681,7 +9682,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE publication_table_list
+ | ALTER PUBLICATION name DROP TABLE relation_expr_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..e946f17 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..9d86a10 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get relation qual. DISTINCT avoids the same expression of a table in
+ * multiple publications from being included multiple times in the final
+ * expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 14d737f..1220203 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +148,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +157,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +639,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache the ExprState in CacheMemoryContext. This is essentially the
+ * same code as ExecPrepareExpr(), which is not used here because it
+ * requires an EState. There should probably be another executor
+ * function to handle execution outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false, i.e., the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change is to be replicated, otherwise false.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the table is published in multiple publications with different
+ * row filters, all of the filters must match for the change to be
+ * replicated.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +808,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +832,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +839,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * The schema must be sent before the relation is possibly switched
+ * to its ancestor below, because maybe_send_schema() also sends the
+ * ancestor's schema.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +872,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +906,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +975,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1113,9 +1294,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1138,6 +1320,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1149,6 +1333,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1162,6 +1347,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for the row filter. The TupleDesc must
+ * live as long as the cache entry remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1171,6 +1372,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1230,9 +1434,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1350,6 +1578,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1359,6 +1588,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1376,7 +1607,15 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ {
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
+ }
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index a485fb2..0f4892c 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4141,6 +4141,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4151,9 +4152,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4162,6 +4170,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4202,6 +4211,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4234,8 +4247,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 29af845..f932a70 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -629,6 +629,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 90ff649..5f6418a 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 561266a..f748434 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,7 +85,9 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
+ Oid relid;
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -113,7 +115,7 @@ extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 45e4f2a..765c656 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3640,6 +3640,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
typedef struct CreatePublicationStmt
@@ -3647,7 +3648,7 @@ typedef struct CreatePublicationStmt
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3660,7 +3661,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index cad1b37..156f14c 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -158,6 +158,77 @@ Tables:
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 04b34ee..331b821 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -93,6 +93,39 @@ ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
DROP TABLE testpub_parted1;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
new file mode 100644
index 0000000..6428f0d
--- /dev/null
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -0,0 +1,300 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions must succeed. In this case, rows are replicated only if c is
+# divisible by both 2 AND 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false, so the row filter
+# of each partition is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, because the row filter references column b, which is
+# not part of the PK or REPLICA IDENTITY; the old tuple therefore has
+# b = NULL and the row filter evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true, so the row filter
+# of the root partitioned table is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After the TRUNCATE and REFRESH PUBLICATION, the initial data copy applies
+# the root partitioned table's row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
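The multi-publication behavior exercised by the TAP test above (rows for tab_rowfilter_2 are replicated only when every subscribed publication's filter accepts them) can be sketched outside of PostgreSQL. This is a minimal Python illustration, not the actual C code: the lambdas are hypothetical stand-ins for the real ExprState evaluation in pgoutput.

```python
# Sketch: when a table is published by several publications subscribed
# together, a row is replicated only if every publication's row filter
# evaluates to true (the filters are effectively AND-ed).

def row_passes(filters, row):
    """Return True if the row satisfies all row filters."""
    return all(f(row) for f in filters)

# Stand-ins for tap_pub_1's (c % 2 = 0) and tap_pub_2's (c % 3 = 0) filters.
filters = [lambda row: row["c"] % 2 == 0,
           lambda row: row["c"] % 3 == 0]

# Mirrors "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)".
replicated = [c for c in range(1, 21) if row_passes(filters, {"c": c})]
```

Only values divisible by both 2 and 3 survive, matching the test's expected `count=3, min=6, max=18`.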
v29-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patchapplication/octet-stream; name=v29-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patchDownload
From a5dbaa22d0c9c5d56d7fd56b956acd0a2ace035b Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 15 Sep 2021 16:40:39 +1000
Subject: [PATCH v29] PS - Add tab auto-complete support for the Row Filter
WHERE.
The following auto-completions are added:
Complete "CREATE PUBLICATION <name> FOR TABLE <name>" with "WHERE (".
Complete "ALTER PUBLICATION <name> ADD TABLE <name>" with "WHERE (".
Complete "ALTER PUBLICATION <name> SET TABLE <name>" with "WHERE (".
---
src/bin/psql/tab-complete.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5cd5838..8686ec6 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1648,6 +1648,11 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET", "TABLE", MatchAny)
+ || Matches("ALTER", "PUBLICATION", MatchAny, "ADD", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (");
/* ALTER PUBLICATION <name> SET ( */
else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("publish", "publish_via_partition_root");
@@ -2693,9 +2698,10 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("TABLE", "ALL TABLES");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL"))
COMPLETE_WITH("TABLES");
- else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES")
- || Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("WITH (");
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
--
1.8.3.1
v29-0003-PS-ExprState-cache-modifications.patchapplication/octet-stream; name=v29-0003-PS-ExprState-cache-modifications.patchDownload
From afc4e60c427ee4f6d7edd0810f9a9c05d49bd6f0 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 15 Sep 2021 17:43:55 +1000
Subject: [PATCH v29] PS - ExprState cache modifications.
The cached row-filter state (e.g. the ExprState list) is now invalidated only
in the rel_sync_cache_relation_cb function, so an ALTER PUBLICATION for one
table no longer causes the row filters of other tables to be invalidated.
All code related to caching row filters has also been removed from the
get_rel_sync_entry function; the caching is now done just before the filters
are needed, in the pgoutput_row_filter function.
These changes are based on suggestions from Amit [1][2].
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
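As a rough illustration of the caching scheme described above (plain Python, not the patch's C code; RelEntry, get_filters, and invalidate_relation are hypothetical stand-ins for RelationSyncEntry, pgoutput_row_filter, and rel_sync_cache_relation_cb):

```python
# Sketch of lazy row-filter caching with per-relation invalidation:
# filters are built on first use, and invalidating one relation does
# not touch the cached filters of other relations.

class RelEntry:
    def __init__(self):
        self.rowfilter_valid = False
        self.exprstate_list = []

cache = {}
build_count = {"n": 0}   # counts how many times filters were (re)built

def get_filters(relid, catalog):
    entry = cache.setdefault(relid, RelEntry())
    if not entry.rowfilter_valid:       # usually one-time execution
        entry.exprstate_list = list(catalog.get(relid, []))
        entry.rowfilter_valid = True
        build_count["n"] += 1
    return entry.exprstate_list

def invalidate_relation(relid):
    # Only this relation's row-filter cache becomes invalid.
    if relid in cache:
        cache[relid].rowfilter_valid = False

catalog = {1: ["a > 1000"], 2: ["c % 2 = 0"]}
get_filters(1, catalog)   # built
get_filters(2, catalog)   # built
get_filters(1, catalog)   # cached, no rebuild
invalidate_relation(1)    # e.g. ALTER PUBLICATION affecting relation 1
get_filters(1, catalog)   # rebuilt; relation 2's cache stays valid
```

The point of the sketch is the last two lines: after invalidating relation 1, only its filters are rebuilt, while relation 2's cached state is untouched.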
---
src/backend/replication/pgoutput/pgoutput.c | 200 +++++++++++++++++++---------
1 file changed, 136 insertions(+), 64 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 1220203..ce5e1c5 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -123,7 +123,15 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
- List *exprstate; /* ExprState for row filter */
+
+ /*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' only means that exprstate_list is correct;
+ * it doesn't mean that there actually is any row filter present for the
+ * current relid.
+ */
+ bool rowfilter_valid;
+ List *exprstate_list; /* ExprState for row filter(s) */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
@@ -161,7 +169,7 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static EState *create_estate_for_relation(Relation rel);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
/*
@@ -731,20 +739,121 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
* If it returns true, the change is replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
{
EState *estate;
ExprContext *ecxt;
ListCell *lc;
bool result = true;
+ Oid relid = RelationGetRelid(relation);
+
+ /*
+ * If the row filter cache is currently flagged "invalid" then we don't
+ * know yet whether there are any row filters for this relation.
+ *
+ * This code normally executes only once per relation.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ bool am_partition = get_rel_relispartition(relid);
+ MemoryContext oldctx;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains. Release the tuple table slot if it
+ * already exists.
+ */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState(s) and cache them in the
+ * entry->exprstate_list.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+ Oid pub_relid = relid;
+
+ if (pub->pubviaroot && am_partition)
+ {
+ if (pub->alltables)
+ pub_relid = llast_oid(get_partition_ancestors(relid));
+ else
+ {
+ List *ancestors = get_partition_ancestors(relid);
+ ListCell *lc2;
+
+ /*
+ * Find the "topmost" ancestor that is in this
+ * publication.
+ */
+ foreach(lc2, ancestors)
+ {
+ Oid ancestor = lfirst_oid(lc2);
+
+ if (list_member_oid(GetRelationPublications(ancestor),
+ pub->oid))
+ {
+ pub_relid = ancestor;
+ }
+ }
+ }
+ }
+
+ /*
+ * Look up whether there is a row filter and, if so, build the ExprState for it.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(pub_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate_list = lappend(entry->exprstate_list, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+
+ } /* loop all subscribed publications */
+
+ entry->rowfilter_valid = true;
+ }
/* Bail out if there is no row filter */
- if (entry->exprstate == NIL)
+ if (entry->exprstate_list == NIL)
return true;
elog(DEBUG3, "table \"%s.%s\" has row filter",
- get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
- get_rel_name(relation->rd_id));
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
PushActiveSnapshot(GetTransactionSnapshot());
@@ -761,7 +870,7 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
* different row filter in these publications, all row filters must be
* matched in order to replicate this change.
*/
- foreach(lc, entry->exprstate)
+ foreach(lc, entry->exprstate_list)
{
ExprState *exprstate = (ExprState *) lfirst(lc);
@@ -840,7 +949,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
break;
/*
@@ -873,7 +982,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -907,7 +1016,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1318,10 +1427,11 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
- entry->exprstate = NIL;
+ entry->exprstate_list = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1333,7 +1443,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
- TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1347,22 +1456,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
publications_valid = true;
}
- /* Release tuple table slot */
- if (entry->scantuple != NULL)
- {
- ExecDropSingleTupleTableSlot(entry->scantuple);
- entry->scantuple = NULL;
- }
-
- /*
- * Create a tuple table slot for row filter. TupleDesc must live as
- * long as the cache remains.
- */
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- tupdesc = CreateTupleDescCopy(tupdesc);
- entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
- MemoryContextSwitchTo(oldctx);
-
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1372,9 +1465,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
{
Publication *pub = lfirst(lc);
bool publish = false;
- HeapTuple rftuple;
- Datum rfdatum;
- bool rfisnull;
if (pub->alltables)
{
@@ -1434,33 +1524,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- /*
- * Cache row filter, if available. All publication-table mappings
- * must be checked. If it is a partition and pubviaroot is true,
- * use the row filter of the topmost partitioned table instead of
- * the row filter of its own partition.
- */
- rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
- if (HeapTupleIsValid(rftuple))
- {
- rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
-
- if (!rfisnull)
- {
- Node *rfnode;
- ExprState *exprstate;
-
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = stringToNode(TextDatumGetCString(rfdatum));
-
- /* Prepare for expression execution */
- exprstate = pgoutput_row_filter_init_expr(rfnode);
- entry->exprstate = lappend(entry->exprstate, exprstate);
- MemoryContextSwitchTo(oldctx);
- }
-
- ReleaseSysCache(rftuple);
- }
}
list_free(pubids);
@@ -1567,6 +1630,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanups. (Will be rebuilt later if needed).
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstate_list != NIL)
+ {
+ list_free_deep(entry->exprstate_list);
+ entry->exprstate_list = NIL;
+ }
}
}
@@ -1607,12 +1685,6 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
-
- if (entry->exprstate != NIL)
- {
- list_free_deep(entry->exprstate);
- entry->exprstate = NIL;
- }
}
MemoryContextSwitchTo(oldctx);
--
1.8.3.1
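As a rough illustrative sketch of what the patch above does (plain Python, not PostgreSQL internals; the class and attribute names here are hypothetical), the key idea is that the invalidation callback only clears one relation's cached filter state, and the filter list is rebuilt lazily on first use inside the row-filter check, with replication allowed only when every publication's filter matches:

```python
class RelSyncEntry:
    def __init__(self):
        self.rowfilter_valid = False   # cache validity, not filter presence
        self.exprstates = []           # compiled filters; may stay empty

class RowFilterCache:
    def __init__(self, catalog):
        self.catalog = catalog         # relid -> list of filter callables
        self.entries = {}

    def invalidate(self, relid):
        # analogue of rel_sync_cache_relation_cb: touches one relation only,
        # so other relations' cached filters stay valid
        if relid in self.entries:
            self.entries[relid].rowfilter_valid = False
            self.entries[relid].exprstates = []

    def row_filter(self, relid, row):
        # analogue of pgoutput_row_filter: rebuild the cache lazily
        entry = self.entries.setdefault(relid, RelSyncEntry())
        if not entry.rowfilter_valid:
            entry.exprstates = list(self.catalog.get(relid, []))
            entry.rowfilter_valid = True
        # no filters -> replicate; otherwise all filters must match
        return all(f(row) for f in entry.exprstates)
```

For example, with a filter like the one from the original proposal (`id > 2000 AND id <= 3000`), only matching rows pass, an unfiltered relation always replicates, and invalidating one relation leaves the other's cache valid.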
v29-0005-PS-POC-Row-filter-validation-walker.patch (application/octet-stream)
From 5565d0a1b00acf75e24beae708bb20e20a7bd6f2 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 15 Sep 2021 18:29:52 +1000
Subject: [PATCH v29] PS - POC Row filter validation walker
This patch implements a parse-tree "walker" to validate a row-filter expression.
Only very simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no functions.
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr
This POC patch also disables (#if 0) all other row-filter errors which were thrown for
EXPR_KIND_PUBLICATION_WHERE in the 0001 patch.
Some regression tests are updated due to the modified validation error messages.
---
src/backend/catalog/dependency.c | 68 +++++++++++++++++++++++++++++++
src/backend/catalog/pg_publication.c | 14 +++++--
src/backend/parser/parse_agg.c | 5 ++-
src/backend/parser/parse_expr.c | 6 ++-
src/backend/parser/parse_func.c | 3 ++
src/backend/parser/parse_oper.c | 2 +
src/include/catalog/dependency.h | 2 +-
src/test/regress/expected/publication.out | 17 +++++---
src/test/regress/sql/publication.sql | 2 +
9 files changed, 107 insertions(+), 12 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index e81f093..405b3cd 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -132,6 +132,12 @@ typedef struct
int subflags; /* flags to pass down when recursing to obj */
} ObjectAddressAndFlags;
+/* for rowfilter_walker */
+typedef struct
+{
+ char *relname;
+} rf_context;
+
/* for find_expr_references_walker */
typedef struct
{
@@ -1554,6 +1560,68 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Walker checks that the row filter expression is legal. Allow only simple
+ * or compound expressions like:
+ *
+ * "(Var Op Const)" or
+ * "(Var Op Const) Bool (Var Op Const)"
+ *
+ * Nothing more complicated is permitted. Specifically, no functions of any kind
+ * and no user-defined operators.
+ */
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var) || IsA(node, Const) || IsA(node, BoolExpr))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ forbidden = _("function calls are not allowed");
+ }
+ else
+ {
+ elog(DEBUG1, "row filter contained something unexpected: %s", nodeToString(node));
+ forbidden = _("too complex");
+ }
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ context->relname),
+ errdetail("%s", forbidden),
+ errhint("only simple expressions using columns and constants are allowed")
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
+/*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if it encounters anything not permitted or unexpected.
+ */
+void
+rowfilter_validator(char *relname, Node *expr)
+{
+ rf_context context = {0};
+
+ context.relname = relname;
+ rowfilter_walker(expr, &context);
+}
+
+/*
* Find all the columns referenced by the row-filter expression and return what
* is found as a list of RfCol. This list is used for row-filter validation.
*/
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index ff2f28d..21ac56a 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -143,7 +143,7 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Walk the parse-tree to decide if the row-filter is valid or not.
*/
static void
-rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node *rfnode, Relation rel)
{
Oid relid = RelationGetRelid(rel);
char *relname = RelationGetRelationName(rel);
@@ -151,6 +151,14 @@ rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
/*
* Rule:
*
+ * Walk the parse-tree and reject anything more complicated than a very
+ * simple expression.
+ */
+ rowfilter_validator(relname, rfnode);
+
+ /*
+ * Rule:
+ *
* If the publish operation contains "delete" then only columns that
* are allowed by the REPLICA IDENTITY rules are permitted to be used in
* the row-filter WHERE clause.
@@ -271,13 +279,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
whereclause = transformWhereClause(pstate,
copyObject(pri->whereClause),
EXPR_KIND_PUBLICATION_WHERE,
- "PUBLICATION");
+ "PUBLICATION WHERE");
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
/* Validate the row-filter. */
- rowfilter_expr_checker(pub, whereclause, targetrel);
+ rowfilter_expr_checker(pub, pstate, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..5e0c391 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,12 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
if (isAgg)
err = _("aggregate functions are not allowed in publication WHERE expressions");
else
err = _("grouping operations are not allowed in publication WHERE expressions");
-
+#endif
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +952,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("window functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..3519e62 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -201,6 +201,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
case T_FuncCall:
{
+#if 0
/*
* Forbid functions in publication WHERE condition
*/
@@ -209,6 +210,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("functions are not allowed in publication WHERE expressions"),
parser_errposition(pstate, exprLocation(expr))));
+#endif
result = transformFuncCall(pstate, (FuncCall *) expr);
break;
@@ -1777,7 +1779,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("cannot use subquery in publication WHERE expression");
+#endif
break;
/*
@@ -3100,7 +3104,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index e946f17..de9600f 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,10 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+ pstate->p_hasTargetSRFs = true;
+#if 0
err = _("set-returning functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..b3588df 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,12 +718,14 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+#if 0
/* Check it's not a custom operator for publication WHERE expressions */
if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("user-defined operators are not allowed in publication WHERE expressions"),
parser_errposition(pstate, location)));
+#endif
/* Do typecasting and build the expression tree */
if (ltree == NULL)
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 2c7310e..dd69aff 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -16,7 +16,6 @@
#include "catalog/objectaddress.h"
-
/*
* Precise semantics of a dependency relationship are specified by the
* DependencyType code (which is stored in a "char" field in pg_depend,
@@ -156,6 +155,7 @@ typedef struct RfCol {
int attnum;
} RfCol;
extern List *rowfilter_find_cols(Node *expr, Oid relId);
+extern void rowfilter_validator(char *relname, Node *expr);
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index aa97e4d..237396e 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -209,16 +209,21 @@ Tables:
-- fail - functions disallowed
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl4"
+DETAIL: function calls are not allowed
+HINT: only simple expressions using columns and constants are allowed
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+HINT: only simple expressions using columns and constants are allowed
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: too complex
+HINT: only simple expressions using columns and constants are allowed
-- fail - WHERE not allowed in DROP
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: syntax error at or near "WHERE"
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 8552b36..faf7450 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -116,6 +116,8 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
-- fail - WHERE not allowed in DROP
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
--
1.8.3.1
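The walker in the POC patch above can be modeled in a few lines (plain Python, not PostgreSQL internals; the node classes and the OID threshold constant are hypothetical stand-ins): recurse over the expression tree, accept only Var, Const, BoolExpr, and built-in OpExpr nodes, and report a reason for anything else:

```python
FIRST_NORMAL_OBJECT_ID = 16384  # OIDs at or above this are user-defined

class Node:
    children = ()

class Var(Node): pass
class Const(Node): pass

class BoolExpr(Node):
    def __init__(self, *args):
        self.children = args

class OpExpr(Node):
    def __init__(self, opno, *args):
        self.opno = opno
        self.children = args

class FuncExpr(Node):
    def __init__(self, *args):
        self.children = args

def rowfilter_walker(node):
    """Return None if the expression is legal, else the rejection reason."""
    if isinstance(node, (Var, Const, BoolExpr)):
        pass  # OK
    elif isinstance(node, OpExpr):
        # OK, except user-defined operators are not allowed
        if node.opno >= FIRST_NORMAL_OBJECT_ID:
            return "user-defined operators are not allowed"
    elif isinstance(node, FuncExpr):
        return "function calls are not allowed"
    else:
        return "too complex"
    # recurse into children, as expression_tree_walker does
    for child in getattr(node, "children", ()):
        reason = rowfilter_walker(child)
        if reason:
            return reason
    return None
```

A compound filter like `(a > 1) AND (b < 5)` maps to a BoolExpr over two built-in OpExprs and passes, while a FuncExpr or a user-defined operator is rejected with the corresponding reason, mirroring the errdetail strings in the patch.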
v29-0004-PS-Row-filter-validation-of-replica-identity.patch (application/octet-stream)
From 788ce062547161ec3ce0fa5082a834dce6e85d52 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 15 Sep 2021 18:06:54 +1000
Subject: [PATCH v29] PS - Row filter validation of replica identity
This patch introduces some additional row filter validation. Currently it is
implemented only for the publish mode "delete" and it validates that any columns
referenced in the filter expression must be part of REPLICA IDENTITY or Primary
Key.
Also updated test code.
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
---
src/backend/catalog/dependency.c | 58 ++++++++++++++++++
src/backend/catalog/pg_publication.c | 74 +++++++++++++++++++++++
src/include/catalog/dependency.h | 6 ++
src/test/regress/expected/publication.out | 97 +++++++++++++++++++++++++++++--
src/test/regress/sql/publication.sql | 78 ++++++++++++++++++++++++-
src/test/subscription/t/025_row_filter.pl | 7 +--
6 files changed, 310 insertions(+), 10 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index 91c3e97..e81f093 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -1554,6 +1554,64 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Find all the columns referenced by the row-filter expression and return what
+ * is found as a list of RfCol. This list is used for row-filter validation.
+ */
+List *
+rowfilter_find_cols(Node *expr, Oid relId)
+{
+ find_expr_references_context context;
+ RangeTblEntry rte;
+ int ref;
+ List *rfcol_list = NIL;
+
+ context.addrs = new_object_addresses();
+
+ /* We gin up a rather bogus rangetable list to handle Vars */
+ MemSet(&rte, 0, sizeof(rte));
+ rte.type = T_RangeTblEntry;
+ rte.rtekind = RTE_RELATION;
+ rte.relid = relId;
+ rte.relkind = RELKIND_RELATION; /* no need for exactness here */
+ rte.rellockmode = AccessShareLock;
+
+ context.rtables = list_make1(list_make1(&rte));
+
+ /* Scan the expression tree for referenceable objects */
+ find_expr_references_walker(expr, &context);
+
+ /* Remove any duplicates */
+ eliminate_duplicate_dependencies(context.addrs);
+
+ /* Build/Return the list of columns referenced by this Row Filter */
+ for (ref = 0; ref < context.addrs->numrefs; ref++)
+ {
+ ObjectAddress *thisobj = context.addrs->refs + ref;
+
+ if (thisobj->classId == RelationRelationId)
+ {
+ RfCol *rfcol;
+
+ /*
+ * The parser already took care of ensuring columns must be from
+ * the correct table.
+ */
+ Assert(thisobj->objectId == relId);
+
+ rfcol = palloc(sizeof(RfCol));
+ rfcol->name = get_attname(thisobj->objectId, thisobj->objectSubId, false);
+ rfcol->attnum = thisobj->objectSubId;
+
+ rfcol_list = lappend(rfcol_list, rfcol);
+ }
+ }
+
+ free_object_addresses(context.addrs);
+
+ return rfcol_list;
+}
+
+/*
* recordDependencyOnExpr - find expression dependencies
*
* This is used to find the dependencies of rules, constraint expressions,
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index a1ea0f8..ff2f28d 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -139,6 +139,77 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
+/*
+ * Walk the parse-tree to decide if the row-filter is valid or not.
+ */
+static void
+rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+{
+ Oid relid = RelationGetRelid(rel);
+ char *relname = RelationGetRelationName(rel);
+
+ /*
+ * Rule:
+ *
+ * If the publish operation contains "delete" then only columns that
+ * are allowed by the REPLICA IDENTITY rules are permitted to be used in
+ * the row-filter WHERE clause.
+ *
+ * TODO - check later for publish "update" case.
+ */
+ if (pub->pubactions.pubdelete)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ List *rfcols;
+ ListCell *lc;
+ Bitmapset *bms_okcols;
+
+ /*
+ * Find what are the cols that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+
+ /*
+ * Find what cols are referenced in the row filter WHERE clause,
+ * and validate that each of those referenced cols is allowed.
+ */
+ rfcols = rowfilter_find_cols(rfnode, relid);
+ foreach(lc, rfcols)
+ {
+ RfCol *rfcol = lfirst(lc);
+ char *colname = rfcol->name;
+ int attnum = rfcol->attnum;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, bms_okcols))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ relname),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+ }
+
+ bms_free(bms_okcols);
+ list_free_deep(rfcols);
+ }
+ }
+}
/*
* Insert new publication / relation mapping.
@@ -204,6 +275,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 2885f35..2c7310e 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -151,6 +151,12 @@ extern void performDeletion(const ObjectAddress *object,
extern void performMultipleDeletions(const ObjectAddresses *objects,
DropBehavior behavior, int flags);
+typedef struct RfCol {
+ char *name;
+ int attnum;
+} RfCol;
+extern List *rowfilter_find_cols(Node *expr, Oid relId);
+
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
DependencyType behavior);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 156f14c..aa97e4d 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -163,13 +163,15 @@ CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -179,7 +181,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -190,7 +192,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -201,7 +203,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
@@ -229,6 +231,91 @@ DROP TABLE testpub_rf_tbl4;
DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 331b821..8552b36 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -98,7 +98,9 @@ CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -125,6 +127,80 @@ DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index 6428f0d..dc9becc 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -223,9 +225,7 @@ $node_publisher->wait_for_catchup($appname);
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -234,7 +234,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
--
1.8.3.1
v29-0006-support-updates-based-on-old-and-new-tuple-in-ro.patch (application/octet-stream) Download
From b2ab430f4ac14c608f9d1bf678863fd01850453b Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Mon, 20 Sep 2021 05:10:42 -0400
Subject: [PATCH v29] support updates based on old and new tuple in row filters
When applying row filter on updates, check both old_tuple and new_tuple
to decide how an update needs to be transformed.
UPDATE
old-row (match) new-row (no match) -> DELETE
old-row (no match) new row (match) -> INSERT
old-row (match) new row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
---
src/backend/replication/pgoutput/pgoutput.c | 159 +++++++++++++++++++++++++---
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/025_row_filter.pl | 4 +-
3 files changed, 152 insertions(+), 17 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index ce5e1c5..18c6cbf 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -167,10 +167,14 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry,
+ ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -734,18 +738,110 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * Update is checked against the row filter, if any.
+ * Updates are transformed to inserts and deletes based on the
+ * old_tuple and new_tuple. The new action is updated in the
+ * action parameter. If not updated, action remains as update.
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (no match) new row (match) -> INSERT
+ * old-row (match) new row (match) -> UPDATE
+ * old-row (no match) new-row (no match) -> (drop change)
+ * If it returns true, the change is to be replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter_update(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
{
- EState *estate;
- ExprContext *ecxt;
+ /* Bail out if there is no row filter */
+ if (entry->exprstate_list == NIL)
+ return true;
+
+ /* updates require a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /*
+ * If no old_tuple, then none of the replica identity columns changed
+ * and this would reduce to a simple update.
+ */
+ if (!oldtuple)
+ return pgoutput_row_filter(relation, NULL, newtuple, entry);
+
+ {
+ TupleDesc desc = entry->scantuple->tts_tupleDescriptor;
+ int i;
+ bool old_matched, new_matched;
+ Datum *values_old = (Datum *) palloc(desc->natts * sizeof(Datum));
+ Datum *values_new = (Datum *) palloc(desc->natts * sizeof(Datum));
+ bool *isnull_old = (bool *) palloc(desc->natts * sizeof(bool));
+ bool *isnull_new = (bool *) palloc(desc->natts * sizeof(bool));
+ HeapTuple tmpoldtuple;
+ HeapTuple tmpnewtuple;
+
+ /*
+ * We need to apply the row filter on both the old tuple and the new tuple.
+ * But the old tuple only has changed columns that are part of the replica identity.
+ * To complete the set of replica identity columns in the old tuple, copy over the
+ * columns from the new tuple. Also, unchanged toasted replica identity columns are
+ * detoasted only in the old tuple, so copy those over to the new tuple.
+ */
+ heap_deform_tuple(newtuple, desc, values_new, isnull_new);
+ heap_deform_tuple(oldtuple, desc, values_old, isnull_old);
+
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ if (isnull_new[i])
+ continue;
+
+ if (isnull_old[i])
+ {
+ values_old[i] = values_new[i];
+ isnull_old[i] = false;
+ }
+
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(values_new[i])) &&
+ (!isnull_old[i] && !(VARATT_IS_EXTERNAL_ONDISK(values_old[i]))))
+ values_new[i] = values_old[i];
+ }
+ tmpoldtuple = heap_form_tuple(desc, values_old, isnull_old);
+ tmpnewtuple = heap_form_tuple(desc, values_new, isnull_new);
+
+ old_matched = pgoutput_row_filter(relation, NULL, tmpoldtuple, entry);
+ new_matched = pgoutput_row_filter(relation, NULL, tmpnewtuple, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (old_matched && new_matched)
+ {
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ }
+ else if (old_matched && !new_matched)
+ {
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ }
+ else if (new_matched && !old_matched)
+ {
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+ }
+
+ return true;
+ }
+}
+
+/*
+ * Initialize the row filter the first time it is needed.
+ */
+
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
+{
+ Oid relid = RelationGetRelid(relation);
ListCell *lc;
- bool result = true;
- Oid relid = RelationGetRelid(relation);
/*
* If the row filter caching is currently flagged "invalid" then it means we
@@ -846,6 +942,21 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
entry->rowfilter_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ ListCell *lc;
/* Bail out if there is no row filter */
if (entry->exprstate_list == NIL)
@@ -941,6 +1052,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -949,7 +1063,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
break;
/*
@@ -980,9 +1094,11 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter_update(relation, oldtuple, newtuple, relentry,
+ &modified_action))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1005,6 +1121,23 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
OutputPluginPrepareWrite(ctx, true);
+
+ switch(modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ logicalrep_write_update(ctx->out, xid, relation, oldtuple,
+ newtuple, data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
-logicalrep_write_update(ctx->out, xid, relation, oldtuple,
-newtuple, data->binary);
OutputPluginWrite(ctx, true);
@@ -1016,7 +1151,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index dc9becc..742bbbe 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -220,7 +220,8 @@ $node_publisher->wait_for_catchup($appname);
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+# but the row is deleted by the UPDATE below.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -232,7 +233,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
--
1.8.3.1
On Mon, Sep 20, 2021 at 3:17 PM Ajin Cherian <itsajin@gmail.com> wrote:
On Wed, Sep 8, 2021 at 7:59 PM Ajin Cherian <itsajin@gmail.com> wrote:
On Wed, Sep 1, 2021 at 9:23 PM Euler Taveira <euler@eulerto.com> wrote:
Somehow this approach of either new_tuple or old_tuple doesn't seem to
be very fruitful if the user requires that his replica is up-to-date
based on the filter condition. For that, I think you will need to
convert UPDATES to either INSERTS or DELETES if only new_tuple or
old_tuple matches the filter condition but not both match the filter
condition.
UPDATE
old-row (match) new-row (no match) -> DELETE
old-row (no match) new row (match) -> INSERT
old-row (match) new row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Adding a patch that strives to do the logic that I described above.
For updates, the row filter is applied on both old_tuple
and new_tuple. This patch assumes that the row filter only uses
columns that are part of the REPLICA IDENTITY. (the current patch-set
only
restricts this for row-filters that are delete only)
The old_tuple only has columns that are part of the old_tuple and have
been changed, which is a problem while applying the row-filter. Since
unchanged REPLICA IDENTITY columns
are not present in the old_tuple, this patch creates a temporary
old_tuple by getting such column values from the new_tuple and then
applies the filter on this hand-created temp old_tuple. The way the
old_tuple is created can be better optimised in future versions.
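The old/new match table can be sketched as a small decision function. This is only an illustrative sketch: the enum values and names below are hypothetical, and in the actual patch the logic is folded into pgoutput_row_filter_update() using ReorderBufferChangeType, with "drop" expressed by returning false to the caller.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical sketch of the UPDATE transformation discussed above.
 * CHANGE_DROP stands in for skipping the change entirely.
 */
typedef enum
{
	CHANGE_INSERT,
	CHANGE_UPDATE,
	CHANGE_DELETE,
	CHANGE_DROP
} ChangeType;

static ChangeType
transform_update(bool old_matched, bool new_matched)
{
	if (old_matched && new_matched)
		return CHANGE_UPDATE;	/* row stays inside the filtered set */
	if (old_matched)
		return CHANGE_DELETE;	/* row leaves the filtered set */
	if (new_matched)
		return CHANGE_INSERT;	/* row enters the filtered set */
	return CHANGE_DROP;			/* row was never visible downstream */
}
```

The point of the transformation is that the subscriber's filtered copy stays consistent: a row that moves out of the filtered set must be deleted downstream, and one that moves in must be inserted.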
Yeah, this is the kind of idea which can work. One thing you might
want to check is the overhead of the additional deform/form cycle. You
might want to use Peter's tests above. I think you need to only form
old/new tuples when you have changed something in it but on a quick
look, it seems you are always re-forming both the tuples.
--
With Regards,
Amit Kapila.
On Mon, Sep 20, 2021 at 5:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Adding a patch that strives to do the logic that I described above.
For updates, the row filter is applied on both old_tuple
and new_tuple. This patch assumes that the row filter only uses
columns that are part of the REPLICA IDENTITY. (the current patch-set
only
restricts this for row-filters that are delete only)
The old_tuple only has columns that are part of the old_tuple and have
been changed, which is a problem while applying the row-filter. Since
unchanged REPLICA IDENTITY columns
are not present in the old_tuple, this patch creates a temporary
old_tuple by getting such column values from the new_tuple and then
applies the filter on this hand-created temp old_tuple. The way the
old_tuple is created can be better optimised in future versions.
I understand why this is done, but I have 2 concerns here 1) We are
having extra deform and copying the field from new to old in case it
is unchanged replica identity. 2) The same unchanged attribute values
get qualified in the old tuple as well as in the new tuple. What
exactly needs to be done is that the only updated field should be
validated as part of the old as well as the new tuple, the unchanged
field does not make sense to have redundant validation. For that we
will have to change the filter for the old tuple to just validate the
attributes which are actually modified and remaining unchanged and new
values will anyway get validated in the new tuple.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Sep 21, 2021 at 12:03 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Mon, Sep 20, 2021 at 5:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Adding a patch that strives to do the logic that I described above.
For updates, the row filter is applied on both old_tuple
and new_tuple. This patch assumes that the row filter only uses
columns that are part of the REPLICA IDENTITY. (the current patch-set
only
restricts this for row-filters that are delete only)
The old_tuple only has columns that are part of the old_tuple and have
been changed, which is a problem while applying the row-filter. Since
unchanged REPLICA IDENTITY columns
are not present in the old_tuple, this patch creates a temporary
old_tuple by getting such column values from the new_tuple and then
applies the filter on this hand-created temp old_tuple. The way the
old_tuple is created can be better optimised in future versions.
I understand why this is done, but I have 2 concerns here 1) We are
having extra deform and copying the field from new to old in case it
is unchanged replica identity. 2) The same unchanged attribute values
get qualified in the old tuple as well as in the new tuple. What
exactly needs to be done is that the only updated field should be
validated as part of the old as well as the new tuple, the unchanged
field does not make sense to have redundant validation. For that we
will have to change the filter for the old tuple to just validate the
attributes which are actually modified and remaining unchanged and new
values will anyway get validated in the new tuple.
But what if the filter expression depends on multiple columns, say (a+b) > 100
where a is unchanged while b is changed. Then we will still need both
columns for applying
the filter even though one is unchanged. Also, I am not aware of any
mechanism by which
we can apply a filter expression on individual attributes. The current
mechanism does it
on a tuple. Do let me know if you have any ideas there?
Even if it were done, there would still be the overhead of deforming the tuple.
I will run some performance tests like Amit suggested and see what the
overhead is and
try to minimise it.
regards,
Ajin Cherian
Fujitsu Australia
On Tue, Sep 21, 2021 at 8:58 AM Ajin Cherian <itsajin@gmail.com> wrote:
I understand why this is done, but I have 2 concerns here 1) We are
having extra deform and copying the field from new to old in case it
is unchanged replica identity. 2) The same unchanged attribute values
get qualified in the old tuple as well as in the new tuple. What
exactly needs to be done is that the only updated field should be
validated as part of the old as well as the new tuple, the unchanged
field does not make sense to have redundant validation. For that we
will have to change the filter for the old tuple to just validate the
attributes which are actually modified and remaining unchanged and new
values will anyway get validated in the new tuple.
But what if the filter expression depends on multiple columns, say (a+b) > 100
where a is unchanged while b is changed. Then we will still need both
columns for applying
In such a case, we need to.
the filter even though one is unchanged. Also, I am not aware of any
mechanism by which
we can apply a filter expression on individual attributes. The current
mechanism does it
on a tuple. Do let me know if you have any ideas there?
What I suggested is to modify the filter for the old tuple, e.g.
filter is (a > 10 and b < 20 and c+d = 20), now only if a and c are
modified then we can process the expression and we can transform this
filter to (a > 10 and c+d=20).
Even if it were done, there would still be the overhead of deforming the tuple.
Suppose filter is just (a > 10 and b < 20) and only if the a is
updated, and if we are able to modify the filter for the oldtuple to
be just (a>10) then also do we need to deform? Even if we have to we
can save a lot on avoiding duplicate expression evaluation.
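The clause-pruning idea could be sketched roughly as below. The representation (one bitmask of referenced columns per AND-clause) and all names are hypothetical, and as the counterexample later in the thread shows, this split is an optimization proposal rather than anything the patch does. For the filter (a > 10 AND b < 20 AND c+d = 20) with only a and c modified, the b-clause is dropped and (a > 10 AND c+d = 20) remains.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch: a row filter as a conjunction of clauses, each
 * tagged with a bitmask of the columns it references.  For the old
 * tuple, keep only clauses that touch at least one modified column;
 * clauses over unchanged columns get validated via the new tuple.
 */
typedef struct
{
	uint32_t	refd_cols;		/* bitmask of referenced columns */
} FilterClause;

static int
prune_for_old_tuple(const FilterClause *clauses, int nclauses,
					uint32_t modified_cols, FilterClause *out)
{
	int			kept = 0;

	for (int i = 0; i < nclauses; i++)
		if (clauses[i].refd_cols & modified_cols)
			out[kept++] = clauses[i];
	return kept;
}
```

Note that a clause like c+d = 20 still forces d into the old tuple even when d is unchanged, because the clause's column set spans both changed and unchanged columns.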
I will run some performance tests like Amit suggested and see what the
overhead is and
try to minimise it.
It is good to know, I think you must try with some worst-case
scenarios, e.g. we have 10 text column and 1 int column in the REPLICA
IDENTITY and only the int column get updated and all the text column
are not updated, and you have a filter on all the columns.
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Sep 21, 2021 at 9:54 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Sep 21, 2021 at 8:58 AM Ajin Cherian <itsajin@gmail.com> wrote:
I understand why this is done, but I have 2 concerns here 1) We are
having extra deform and copying the field from new to old in case it
is unchanged replica identity. 2) The same unchanged attribute values
get qualified in the old tuple as well as in the new tuple. What
exactly needs to be done is that the only updated field should be
validated as part of the old as well as the new tuple, the unchanged
field does not make sense to have redundant validation. For that we
will have to change the filter for the old tuple to just validate the
attributes which are actually modified and remaining unchanged and new
values will anyway get validated in the new tuple.But what if the filter expression depends on multiple columns, say (a+b) > 100
where a is unchanged while b is changed. Then we will still need both
columns for applying
In such a case, we need to.
the filter even though one is unchanged. Also, I am not aware of any
mechanism by which
we can apply a filter expression on individual attributes. The current
mechanism does it
on a tuple. Do let me know if you have any ideas there?
What I suggested is to modify the filter for the old tuple, e.g.
filter is (a > 10 and b < 20 and c+d = 20), now only if a and c are
modified then we can process the expression and we can transform this
filter to (a > 10 and c+d=20).
If you have only a and c in the old tuple, how will it evaluate
expression c + d? I think the point is if for some expression some
values are in old tuple and others are in new then the idea proposed
in the patch seems sane. Moreover, I think in your idea for each tuple
we might need to build a new expression and sometimes twice that will
defeat the purpose of the cache we have kept in the patch and I am not sure
if it is less costly.
See another example where splitting filter might not give desired results:
Say filter expression: (a = 10 and b = 20 and c = 30)
Now, old_tuple has values for columns a and c and say values are 10
and 30. So, the old_tuple will match the filter if we split it as per
your suggestion. Now say new_tuple has values (a = 5, b = 15, c = 25).
In such a situation dividing the filter will give us the result that
the old_tuple is matching but new tuple is not matching which seems
incorrect. I think dividing filter conditions among old and new tuples
might not retain its sanctity.
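The counterexample can be checked concretely. The helper names below are hypothetical and only illustrate why evaluating a subset of the clauses against the old tuple can disagree with the full filter:

```c
#include <assert.h>
#include <stdbool.h>

/* Full filter: (a = 10 AND b = 20 AND c = 30). */
static bool
full_filter(int a, int b, int c)
{
	return a == 10 && b == 20 && c == 30;
}

/*
 * Split filter for the old tuple, which carries only a and c: the
 * b-clause is dropped.  With old values a = 10, c = 30 this reports a
 * match, yet the new tuple (5, 15, 25) fails the full filter, so the
 * divided filter would classify old as matching and new as not, i.e.
 * wrongly turn the UPDATE into a DELETE.
 */
static bool
split_filter_old(int a, int c)
{
	return a == 10 && c == 30;
}
```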
Even if it were done, there would still be the overhead of deforming the tuple.
Suppose filter is just (a > 10 and b < 20) and only if the a is
updated, and if we are able to modify the filter for the oldtuple to
be just (a>10) then also do we need to deform?
Without deforming, how will you determine which columns are part of
the old tuple?
--
With Regards,
Amit Kapila.
On Tue, Sep 21, 2021 at 10:41 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
If you have only a and c in the old tuple, how will it evaluate
expression c + d?
Well, what I told is that if we have such dependency then we will have
to copy that field to the old tuple, e.g. if we convert the filter for
the old tuple from (a > 10 and b < 20 and c+d = 20) to (a > 10 and
c+d=20), then we will not have to copy 'b' to the old tuple but we
still have to copy 'd' because there is a dependency.
I think the point is if for some expression some
values are in old tuple and others are in new then the idea proposed
in the patch seems sane. Moreover, I think in your idea for each tuple
we might need to build a new expression and sometimes twice that will
defeat the purpose of the cache we have kept in the patch and I am not sure
if it is less costly.
Basically, expression initialization should happen only once in most
cases so with my suggestion you might have to do it twice. But the
overhead of extra expression evaluation is far less than doing
duplicate evaluation because that will happen for sending each update
operation right?
See another example where splitting filter might not give desired results:
Say filter expression: (a = 10 and b = 20 and c = 30)
Now, old_tuple has values for columns a and c and say values are 10
and 30. So, the old_tuple will match the filter if we split it as per
your suggestion. Now say new_tuple has values (a = 5, b = 15, c = 25).
In such a situation dividing the filter will give us the result that
the old_tuple is matching but new tuple is not matching which seems
incorrect. I think dividing filter conditions among old and new tuples
might not retain its sanctity.
Yeah that is a good example to apply a duplicate filter, basically
some filters might not even get evaluated on new tuples as the above
example and if we have removed such expression on the other tuple we
might break something. Maybe for now this suggests that we might not
be able to avoid the duplicate execution of the expression.
Even if it were done, there would still be the overhead of deforming the tuple.
Suppose filter is just (a > 10 and b < 20) and only if the a is
updated, and if we are able to modify the filter for the oldtuple to
be just (a>10) then also do we need to deform?
Without deforming, how will you determine which columns are part of
the old tuple?
Okay, then we might have to deform, but at least are we ensuring that
once we have deform the tuple for the expression evaluation then we
are not doing that again while sending the tuple?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Sep 21, 2021 at 11:16 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Sep 21, 2021 at 10:41 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
I think the point is if for some expression some
values are in old tuple and others are in new then the idea proposed
in the patch seems sane. Moreover, I think in your idea for each tuple
we might need to build a new expression and sometimes twice that will
beat the purpose of cache we have kept in the patch and I am not sure
if it is less costly.

Basically, expression initialization should happen only once in most
cases so with my suggestion you might have to do it twice.
No, the situation will be that we might have to do it twice per update
whereas now, it is just done at the very first operation on a
relation.
But the
overhead of extra expression evaluation is far less than doing
duplicate evaluation because that will happen for sending each update
operation right?
Expression evaluation has to be done twice because every update can
have a different set of values in the old and new tuple.
See another example where splitting filter might not give desired results:
Say filter expression: (a = 10 and b = 20 and c = 30)
Now, old_tuple has values for columns a and c and say values are 10
and 30. So, the old_tuple will match the filter if we split it as per
your suggestion. Now say new_tuple has values (a = 5, b = 15, c = 25).
In such a situation dividing the filter will give us the result that
the old_tuple is matching but new tuple is not matching which seems
incorrect. I think dividing filter conditions among old and new tuples
might not retain its sanctity.

Yeah that is a good example to apply a duplicate filter, basically
some filters might not even get evaluated on new tuples as the above
example and if we have removed such expression on the other tuple we
might break something.
Right.
Maybe for now this suggests that we might not
be able to avoid the duplicate execution of the expression.
So, IIUC, you agreed that let's proceed with the proposed approach and
we can later do optimizations if possible or if we get better ideas.
Even if it were done, there would still be the overhead of deforming the tuple.
Suppose the filter is just (a > 10 and b < 20) and only a is
updated; if we are able to modify the filter for the oldtuple to
be just (a>10), then do we still need to deform?

Without deforming, how will you determine which columns are part of
the old tuple?

Okay, then we might have to deform, but at least are we ensuring that
once we have deform the tuple for the expression evaluation then we
are not doing that again while sending the tuple?
I think this is possible but we might want to be careful not to send
extra unchanged values as we are doing now.
--
With Regards,
Amit Kapila.
On Tue, Sep 21, 2021 at 2:34 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Sep 21, 2021 at 11:16 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Sep 21, 2021 at 10:41 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
I think the point is if for some expression some
values are in old tuple and others are in new then the idea proposed
in the patch seems sane. Moreover, I think in your idea for each tuple
we might need to build a new expression and sometimes twice that will
beat the purpose of cache we have kept in the patch and I am not sure
if it is less costly.

Basically, expression initialization should happen only once in most
cases so with my suggestion you might have to do it twice.

No, the situation will be that we might have to do it twice per update
whereas now, it is just done at the very first operation on a
relation.
Yeah right. Actually, I mean it will not get initialized for decoding
each tuple, so instead of once it will be done twice, but anyway now
we agree that we cannot proceed in this direction because of the
issue you pointed out.
Maybe for now this suggests that we might not
be able to avoid the duplicate execution of the expression.

So, IIUC, you agreed that let's proceed with the proposed approach and
we can later do optimizations if possible or if we get better ideas.
Make sense.
Okay, then we might have to deform, but at least are we ensuring that
once we have deform the tuple for the expression evaluation then we
are not doing that again while sending the tuple?

I think this is possible but we might want to be careful not to send
extra unchanged values as we are doing now.
Right.
Some more comments,
In pgoutput_row_filter_update(), first, we are deforming the tuple in
local datum, then modifying the tuple, and then reforming the tuple.
I think we can surely do better here. Currently, you are reforming
the tuple so that you can store it in the scan slot by calling
ExecStoreHeapTuple which will be used for expression evaluation.
Instead of that what you need to do is to deform the tuple using
tts_values of the scan slot and later call ExecStoreVirtualTuple(), so
advantages are 1) you don't need to reform the tuple 2) the expression
evaluation machinery doesn't need to deform again for fetching the
value of the attribute, instead it can directly get from the value
from the virtual tuple.
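[Editorial note: as a rough conceptual model of the suggestion, in plain Python rather than PostgreSQL internals. `deform` and `VirtualSlot` are stand-ins for heap_deform_tuple and a TupleTableSlot stored via ExecStoreVirtualTuple; the point is that the tuple is parsed exactly once and both the filter and the send path read the already-deformed values.]

```python
import struct

raw = struct.pack("3i", 10, 20, 30)      # stand-in for an on-disk heap tuple

def deform(raw_tuple):
    """Parse the packed representation into per-column values (one pass)."""
    return list(struct.unpack("3i", raw_tuple))

class VirtualSlot:
    """Stand-in for a slot holding already-deformed values."""
    def __init__(self, values):
        self.tts_values = values          # ExecStoreVirtualTuple equivalent

slot = VirtualSlot(deform(raw))          # deform exactly once

# The filter evaluation reads the deformed values directly...
passes_filter = slot.tts_values[0] > 5
# ...and so does the output path, with no reform and no second deform.
payload = slot.tts_values

print(passes_filter)   # True
```

The alternative being criticized (deform, reform into a heap tuple, then let the expression machinery deform again) would parse the same bytes twice.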
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Sep 21, 2021 at 4:29 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
Some more comments,
In pgoutput_row_filter_update(), first, we are deforming the tuple in
local datum, then modifying the tuple, and then reforming the tuple.
I think we can surely do better here. Currently, you are reforming
the tuple so that you can store it in the scan slot by calling
ExecStoreHeapTuple which will be used for expression evaluation.
Instead of that what you need to do is to deform the tuple using
tts_values of the scan slot and later call ExecStoreVirtualTuple(), so
advantages are 1) you don't need to reform the tuple 2) the expression
evaluation machinery doesn't need to deform again for fetching the
value of the attribute, instead it can directly get from the value
from the virtual tuple.
I have one more question, while looking into the
ExtractReplicaIdentity() function, it seems that if any of the "rep
ident key" fields is changed then we will write all the key fields in
the WAL as part of the old tuple, not just the changed fields. That
means either the old tuple will be NULL or it will be having all the
key attributes. So if we are supporting filter only on the "rep ident
key fields" then is there any need to copy the fields from the new
tuple to the old tuple?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Sep 21, 2021 at 9:42 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Sep 21, 2021 at 4:29 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
Some more comments,
In pgoutput_row_filter_update(), first, we are deforming the tuple in
local datum, then modifying the tuple, and then reforming the tuple.
I think we can surely do better here. Currently, you are reforming
the tuple so that you can store it in the scan slot by calling
ExecStoreHeapTuple which will be used for expression evaluation.
Instead of that what you need to do is to deform the tuple using
tts_values of the scan slot and later call ExecStoreVirtualTuple(), so
advantages are 1) you don't need to reform the tuple 2) the expression
evaluation machinery doesn't need to deform again for fetching the
value of the attribute, instead it can directly get from the value
from the virtual tuple.

I have one more question, while looking into the
ExtractReplicaIdentity() function, it seems that if any of the "rep
ident key" fields is changed then we will write all the key fields in
the WAL as part of the old tuple, not just the changed fields. That
means either the old tuple will be NULL or it will be having all the
key attributes. So if we are supporting filter only on the "rep ident
key fields" then is there any need to copy the fields from the new
tuple to the old tuple?
Yes, I just figured this out while testing. So we don't need to copy fields
from the new tuple to the old tuple.
But there is still the case of your fix for the unchanged toasted RI
key fields in the new tuple which needs to be copied from the old
tuple to the new tuple. This particular case seems to violate both
rules: that an old tuple will be present only when there are changed
RI key fields, and that if there is an old tuple it will contain all
RI key fields. I think we still need to deform both old tuple and new
tuple, just to handle this case.

There is currently logic in ReorderBufferToastReplace() which already
deforms the new tuple to detoast changed toasted fields in the new
tuple. I think if we can enhance this logic for our purpose, then we
can avoid an extra deform of the new tuple.

But I think you had earlier indicated that having untoasted unchanged
values in the new tuple can be bothersome.

Any suggestions?
regards,
Ajin Cherian
Fujitsu Australia
On Wed, Sep 22, 2021 at 6:42 AM Ajin Cherian <itsajin@gmail.com> wrote:
On Tue, Sep 21, 2021 at 9:42 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Sep 21, 2021 at 4:29 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I have one more question, while looking into the
ExtractReplicaIdentity() function, it seems that if any of the "rep
ident key" fields is changed then we will write all the key fields in
the WAL as part of the old tuple, not just the changed fields. That
means either the old tuple will be NULL or it will be having all the
key attributes. So if we are supporting filter only on the "rep ident
key fields" then is there any need to copy the fields from the new
tuple to the old tuple?

Yes, I just figured this out while testing. So we don't need to copy fields
from the new tuple to the old tuple.

But there is still the case of your fix for the unchanged toasted RI
key fields in the new tuple which needs to be copied from the old
tuple to the new tuple. This particular case seems to violate both
rules: that an old tuple will be present only when there are changed
RI key fields, and that if there is an old tuple it will contain all
RI key fields.
Why do you think that the second assumption (if there is an old tuple
it will contain all RI key fields.) is broken? It seems to me even
when we are planning to include unchanged toast as part of old_key, it
will contain all the key columns, isn't that true?
I think we
still need to deform both old tuple and new tuple, just to handle this case.
Yeah, but we are anyway talking about saving that cost for later if
we decide to send that tuple. I think we can further try to optimize
it by first checking whether the new tuple has any toasted value, if
so then only we need this extra pass of deforming.
There is currently logic in ReorderBufferToastReplace() which already
deforms the new tuple to detoast changed toasted fields in the new
tuple. I think if we can enhance this logic for our purpose, then we
can avoid an extra deform of the new tuple.

But I think you had earlier indicated that having untoasted unchanged
values in the new tuple can be bothersome.
I think it will be too costly on the subscriber side during apply
because it will update all the unchanged toasted values which will
lead to extra writes both for WAL and data.
--
With Regards,
Amit Kapila.
On Wed, Sep 22, 2021 at 1:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Sep 22, 2021 at 6:42 AM Ajin Cherian <itsajin@gmail.com> wrote:
Why do you think that the second assumption (if there is an old tuple
it will contain all RI key fields.) is broken? It seems to me even
when we are planning to include unchanged toast as part of old_key, it
will contain all the key columns, isn't that true?
Yes, I assumed wrongly. Just checked. What you say is correct.
I think we
still need to deform both old tuple and new tuple, just to handle this case.

Yeah, but we are anyway talking about saving that cost for later if
we decide to send that tuple. I think we can further try to optimize
it by first checking whether the new tuple has any toasted value, if
so then only we need this extra pass of deforming.
Ok, I will go ahead with this approach.
There is currently logic in ReorderBufferToastReplace() which already
deforms the new tuple to detoast changed toasted fields in the new
tuple. I think if we can enhance this logic for our purpose, then we
can avoid an extra deform of the new tuple.

But I think you had earlier indicated that having untoasted unchanged
values in the new tuple
can be bothersome.

I think it will be too costly on the subscriber side during apply
because it will update all the unchanged toasted values which will
lead to extra writes both for WAL and data.
Ok, agreed.
regards,
Ajin Cherian
Fujitsu Australia
On Wed, Sep 22, 2021 at 9:20 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Sep 22, 2021 at 6:42 AM Ajin Cherian <itsajin@gmail.com> wrote:
On Tue, Sep 21, 2021 at 9:42 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Sep 21, 2021 at 4:29 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
I have one more question, while looking into the
ExtractReplicaIdentity() function, it seems that if any of the "rep
ident key" fields is changed then we will write all the key fields in
the WAL as part of the old tuple, not just the changed fields. That
means either the old tuple will be NULL or it will be having all the
key attributes. So if we are supporting filter only on the "rep ident
key fields" then is there any need to copy the fields from the new
tuple to the old tuple?

Yes, I just figured this out while testing. So we don't need to copy fields
from the new tuple to the old tuple.

But there is still the case of your fix for the unchanged toasted RI
key fields in the new tuple which needs to be copied from the old
tuple to the new tuple.
Yes, we will have to do that.
There is currently logic in ReorderBufferToastReplace() which already
deforms the new tuple to detoast changed toasted fields in the new
tuple. I think if we can enhance this logic for our purpose, then we
can avoid an extra deform of the new tuple.

But I think you had earlier indicated that having untoasted unchanged
values in the new tuple
can be bothersome.

I think it will be too costly on the subscriber side during apply
because it will update all the unchanged toasted values which will
lead to extra writes both for WAL and data.
Right we should not do that.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Hi,
I finally had time to take a closer look at the patch again, so here's
some review comments. The thread is moving fast, so chances are some of
the comments are obsolete or were already raised in the past.
1) I wonder if we should use WHERE or WHEN to specify the expression.
WHERE is not wrong, but WHEN (as used in triggers) might be better.
2) create_publication.sgml says:
A <literal>NULL</literal> value causes the expression to evaluate
to false; avoid using columns without not-null constraints in the
<literal>WHERE</literal> clause.
That's not quite correct, I think - doesn't the expression evaluate to
NULL (which is not TRUE, so it counts as mismatch)?
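[Editorial note: for illustration, a toy Python model of SQL's three-valued logic, with None standing in for NULL. It shows the distinction being made: the expression evaluates to NULL, not FALSE, and the row is excluded only because NULL is not TRUE.]

```python
def sql_gt(a, b):
    """SQL-style '>': any comparison involving NULL yields NULL (None)."""
    return None if a is None or b is None else a > b

row = {"id": None}
result = sql_gt(row["id"], 2000)   # NULL > 2000  ->  NULL, not FALSE

print(result)            # None
print(result is True)    # False: the row is not replicated
print(result is False)   # False: but the expression did NOT evaluate to false
```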
I suspect this whole paragraph (talking about NULL in old/new rows)
might be a bit too detailed / low-level for user docs.
3) create_subscription.sgml
<literal>WHERE</literal> clauses, rows must satisfy all expressions
to be copied. If the subscriber is a
I'm rather skeptical about the principle that all expressions have to
match - I'd have expected exactly the opposite behavior, actually.
I see a subscription as "a union of all publications". Imagine for
example you have a data set for all customers, and you create a
publication for different parts of the world, like
CREATE PUBLICATION customers_france
FOR TABLE customers WHERE (country = 'France');
CREATE PUBLICATION customers_germany
FOR TABLE customers WHERE (country = 'Germany');
CREATE PUBLICATION customers_usa
FOR TABLE customers WHERE (country = 'USA');
and now you want to subscribe to multiple publications, because you want
to replicate data for multiple countries (e.g. you want EU countries).
But if you do
CREATE SUBSCRIPTION customers_eu
PUBLICATION customers_france, customers_germany;
then you won't get anything, because each customer belongs to just a
single country. Yes, I could create multiple individual subscriptions,
one for each country, but that's inefficient and may have a different
set of issues (e.g. keeping them in sync when a customer moves between
countries).
I might have missed something, but I haven't found any explanation why
the requirement to satisfy all expressions is the right choice.
IMHO this should be 'satisfies at least one expression' i.e. we should
connect the expressions by OR, not AND.
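[Editorial note: a quick sketch, in plain Python rather than SQL, of the two combination semantics over the customers example above. The lambdas stand in for the per-publication row filters.]

```python
# Per-publication filters subscribed to by customers_eu:
filters = [
    lambda row: row["country"] == "France",   # customers_france
    lambda row: row["country"] == "Germany",  # customers_germany
]

rows = [{"country": "France"}, {"country": "Germany"}, {"country": "USA"}]

# AND semantics (current patch): a row must satisfy ALL filters.
and_combined = [r for r in rows if all(f(r) for f in filters)]

# OR semantics (proposed): a row must satisfy at least one filter.
or_combined = [r for r in rows if any(f(r) for f in filters)]

print(len(and_combined))  # 0 - disjoint filters can never all match
print(len(or_combined))   # 2 - France + Germany, as the subscriber expects
```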
4) pg_publication.c
It's a bit suspicious we're adding includes for parser to a place where
there were none before. I wonder if this might indicate some layering
issue, i.e. doing something in the wrong place ...
5) publicationcmds.c
I mentioned this in my last review [1] already, but I really dislike the
fact that OpenTableList accepts a list containing one of two entirely
separate node types (PublicationTable or Relation). It was modified to
use IsA() instead of a flag, but I still find it ugly, confusing and
possibly error-prone.
Also, not sure mentioning the two different callers explicitly in the
OpenTableList comment is a great idea - it's likely to get stale if
someone adds another caller.
6) parse_oper.c
I'm having some second thoughts about (not) allowing UDFs ...
Yes, I get that if the function starts failing, e.g. because querying a
dropped table or something, that breaks the replication and can't be
fixed without a resync.
That's pretty annoying, but maybe disallowing anything user-defined
(functions and operators) is overly anxious? Also, extensibility
is one of the hallmarks of Postgres, and disallowing all custom UDF and
operators seems to contradict that ...
Perhaps just explaining what the expression can / can't do in the docs,
with clear warnings of the risks, would be acceptable.
7) exprstate_list
I'd just call the field / variable "exprstates", without indicating the
data type. I don't think we do that anywhere.
8) RfCol
Do we actually need this struct? Why not track just the name or attnum,
and look up the other value in syscache when needed?
9) rowfilter_expr_checker
* Walk the parse-tree to decide if the row-filter is valid or not.
I don't see any clear explanation of what "valid" means.
10) WHERE expression vs. data type
Seems ATExecAlterColumnType might need some changes, because changing
the data type of a column referenced by the expression triggers this:
test=# alter table t alter COLUMN c type text;
ERROR: unexpected object depending on column: publication of
table t in publication p
11) extra (unnecessary) parens in the deparsed expression
test=# alter publication p add table t where ((b < 100) and (c < 100));
ALTER PUBLICATION
test=# \dRp+ p
Publication p
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
-------+------------+---------+---------+---------+-----------+----------
user | f | t | t | t | t | f
Tables:
"public.t" WHERE (((b < 100) AND (c < 100)))
12) WHERE expression vs. changing replica identity
Peter Smith already mentioned this in [3], but there's a bunch of places
that need to check the expression vs. replica identity. Consider for
example this:
test=# alter publication p add table t where (b < 100);
ERROR: cannot add relation "t" to publication
DETAIL: Row filter column "b" is not part of the REPLICA IDENTITY
test=# alter table t replica identity full;
ALTER TABLE
test=# alter publication p add table t where (b < 100);
ALTER PUBLICATION
test=# alter table t replica identity using INDEX t_pkey ;
ALTER TABLE
Which means the expression is not covered by the replica identity.
12) misuse of REPLICA IDENTITY
The more I think about this, the more I think we're actually misusing
REPLICA IDENTITY for something entirely different. The whole purpose of
RI was to provide a row identifier for the subscriber.
But now we're using it to ensure we have all the necessary columns,
which is entirely orthogonal to the original purpose. I predict this
will have rather negative consequences.
People will either switch everything to REPLICA IDENTITY FULL, or create
bogus unique indexes with extra columns. Which is really silly, because
it wastes network bandwidth (transfers more data) or local resources
(CPU and disk space to maintain extra indexes).
IMHO this needs more infrastructure to request extra columns to decode
(e.g. for the filter expression), and then remove them before sending
the data to the subscriber.
13) turning update into insert
I agree with Ajin Cherian [4] that looking at just old or new row for
updates is not the right solution, because each option will "break" the
replica in some case. So I think the goal "keeping the replica in sync"
is the right perspective, and converting the update to insert/delete if
needed seems appropriate.
This seems somewhat similar to what pglogical does, because that may
also convert updates (although only to inserts, IIRC) when handling
replication conflicts. The difference is pglogical does all this on the
subscriber, while this makes the decision on the publisher.
I wonder if this might have some negative consequences, or whether
"moving" this to downstream would be useful for other purposes in the
future (e.g. it might be reused for handling other conflicts).
14) pgoutput_row_filter_update
The function name seems a bit misleading, as it might seem like
it updates the row_filter, or something. It should indicate it's about
deciding what to do with the update.
15) pgoutput_row_filter initializing filter
I'm not sure I understand why the filter initialization gets moved from
get_rel_sync_entry. Presumably, most of what the replication does is
replicating rows, so I see little point in not initializing this along
with the rest of the rel_sync_entry.
regards
[1] /messages/by-id/849ee491-bba3-c0ae-cc25-4fce1c03f105@enterprisedb.com
[2] /messages/by-id/7106a0fc-8017-c0fe-a407-9466c9407ff8@2ndquadrant.com
[3] /messages/by-id/CAHut+PukNh_HsN1Au1p9YhG5KCOr3dH5jnwm=RmeX75BOtXTEg@mail.gmail.com
[4] /messages/by-id/CAFPTHDb7bpkuc4SxaL9B5vEvF2aEi0EOERdrG+xgVeAyMJsF=Q@mail.gmail.com
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, Sep 23, 2021 at 6:03 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
6) parse_oper.c
I'm having some second thoughts about (not) allowing UDFs ...
Yes, I get that if the function starts failing, e.g. because querying a
dropped table or something, that breaks the replication and can't be
fixed without a resync.
The other problem is that users can access/query any table inside the
function and that also won't work in a logical decoding environment as
we use historic snapshots using which we can access only catalog
tables.
That's pretty annoying, but maybe disallowing anything user-defined
(functions and operators) is overly anxious? Also, extensibility
is one of the hallmarks of Postgres, and disallowing all custom UDF and
operators seems to contradict that ...

Perhaps just explaining what the expression can / can't do in the docs,
with clear warnings of the risks, would be acceptable.
I think the right way to support functions is by the explicit marking
of functions and in one of the emails above Jeff Davis also agreed
with the same. I think we should probably introduce a new marking for
this. I feel this is important because without this it won't be safe
to access even some of the built-in functions that can access/update
database (non-immutable functions) due to logical decoding environment
restrictions.
12) misuse of REPLICA IDENTITY
The more I think about this, the more I think we're actually misusing
REPLICA IDENTITY for something entirely different. The whole purpose of
RI was to provide a row identifier for the subscriber.

But now we're using it to ensure we have all the necessary columns,
which is entirely orthogonal to the original purpose. I predict this
will have rather negative consequences.

People will either switch everything to REPLICA IDENTITY FULL, or create
bogus unique indexes with extra columns. Which is really silly, because
it wastes network bandwidth (transfers more data) or local resources
(CPU and disk space to maintain extra indexes).

IMHO this needs more infrastructure to request extra columns to decode
(e.g. for the filter expression), and then remove them before sending
the data to the subscriber.
Yeah, but that would have an additional load on write operations and I
am not sure at this stage but maybe there could be other ways to
extend the current infrastructure wherein we build the snapshots using
which we can access the user tables instead of only catalog tables.
Such enhancements if feasible would be useful not only for allowing
additional column access in row filters but for other purposes like
allowing access to functions that access user tables. I feel we can
extend this later as well seeing the usage and requests. For the first
version, this doesn't sound too limiting to me.
--
With Regards,
Amit Kapila.
On Fri, Sep 24, 2021 at 10:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
12) misuse of REPLICA IDENTITY
The more I think about this, the more I think we're actually misusing
REPLICA IDENTITY for something entirely different. The whole purpose of
RI was to provide a row identifier for the subscriber.

But now we're using it to ensure we have all the necessary columns,
which is entirely orthogonal to the original purpose. I predict this
will have rather negative consequences.

People will either switch everything to REPLICA IDENTITY FULL, or create
bogus unique indexes with extra columns. Which is really silly, because
it wastes network bandwidth (transfers more data) or local resources
(CPU and disk space to maintain extra indexes).

IMHO this needs more infrastructure to request extra columns to decode
(e.g. for the filter expression), and then remove them before sending
the data to the subscriber.

Yeah, but that would have an additional load on write operations and I
am not sure at this stage but maybe there could be other ways to
extend the current infrastructure wherein we build the snapshots using
which we can access the user tables instead of only catalog tables.
Such enhancements if feasible would be useful not only for allowing
additional column access in row filters but for other purposes like
allowing access to functions that access user tables. I feel we can
extend this later as well seeing the usage and requests. For the first
version, this doesn't sound too limiting to me.
I agree with one point from Tomas, that if we bind the row filter with
the RI, then if the user has to use the row filter on any column 1)
they have to add an unnecessary column to the index 2) since they have
to add it to the RI, we will now have to send it over the network as
well 3) we anyway have to WAL log it if it is modified because now
we forced users to add some columns to RI because they wanted to use
the row filter on that. Now suppose we remove that limitation and we
somehow make these changes orthogonal to RI, i.e. if we have a row
filter on some column then we WAL log it, so now the only extra cost
we are paying is to just WAL log that column, but the user is not
forced to add it to index, not forced to send it over the network.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Thu, Sep 23, 2021 at 6:03 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
13) turning update into insert
I agree with Ajin Cherian [4] that looking at just old or new row for
updates is not the right solution, because each option will "break" the
replica in some case. So I think the goal "keeping the replica in sync"
is the right perspective, and converting the update to insert/delete if
needed seems appropriate.

This seems somewhat similar to what pglogical does, because that may
also convert updates (although only to inserts, IIRC) when handling
replication conflicts. The difference is pglogical does all this on the
subscriber, while this makes the decision on the publisher.

I wonder if this might have some negative consequences, or whether
"moving" this to downstream would be useful for other purposes in the
future (e.g. it might be reused for handling other conflicts).
Apart from additional traffic, I am not sure how we will handle all
the conditions on subscribers; say if the new row doesn't match, how
will subscribers know about this unless we pass the row_filter or some
additional information along with the tuple. Previously, I have done some
research and shared in one of the emails above that IBM's InfoSphere
Data Replication [1] performs filtering in this way which also
suggests that we won't be off here.
15) pgoutput_row_filter initializing filter
I'm not sure I understand why the filter initialization gets moved from
get_rel_sync_entry. Presumably, most of what the replication does is
replicating rows, so I see little point in not initializing this along
with the rest of the rel_sync_entry.
Sorry, IIRC, this has been suggested by me and I thought it was best
to do any expensive computation the first time it is required. I have
shared a few cases like in [2] where it would lead to additional cost
without any gain. Unless I am missing something, I don't see any
downside of doing it in a delayed fashion.
[1]: https://www.ibm.com/docs/en/idr/11.4.0?topic=rows-search-conditions
[2]: /messages/by-id/CAA4eK1JBHo2U2sZemFdJmcwEinByiJVii8wzGCDVMxOLYB3CUw@mail.gmail.com
--
With Regards,
Amit Kapila.
On Fri, Sep 24, 2021 at 11:06 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Fri, Sep 24, 2021 at 10:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
12) misuse of REPLICA IDENTITY
The more I think about this, the more I think we're actually misusing
REPLICA IDENTITY for something entirely different. The whole purpose of
RI was to provide a row identifier for the subscriber.But now we're using it to ensure we have all the necessary columns,
which is entirely orthogonal to the original purpose. I predict this
will have rather negative consequences.People will either switch everything to REPLICA IDENTITY FULL, or create
bogus unique indexes with extra columns. Which is really silly, because
it wastes network bandwidth (transfers more data) or local resources
(CPU and disk space to maintain extra indexes).IMHO this needs more infrastructure to request extra columns to decode
(e.g. for the filter expression), and then remove them before sending
the data to the subscriber.Yeah, but that would have an additional load on write operations and I
am not sure at this stage but maybe there could be other ways to
extend the current infrastructure wherein we build the snapshots using
which we can access the user tables instead of only catalog tables.
Such enhancements if feasible would be useful not only for allowing
additional column access in row filters but for other purposes like
allowing access to functions that access user tables. I feel we can
extend this later as well seeing the usage and requests. For the first
version, this doesn't sound too limiting to me.
I agree with one point from Tomas, that if we bind the row filter with
the RI, then if the user has to use the row filter on any column 1)
they have to add an unnecessary column to the index 2) Since they have
to add it to RI so now we will have to send it over the network as
well. 3). We anyway have to WAL log it if it is modified because now
we forced users to add some columns to RI because they wanted to use
the row filter on that. Now suppose we remove that limitation and we
somehow make these changes orthogonal to RI, i.e. if we have a row
filter on some column then we WAL log it, so now the only extra cost
we are paying is to just WAL log that column, but the user is not
forced to add it to index, not forced to send it over the network.
I am not suggesting adding additional columns to RI just for using
filter expressions. If most users that intend to publish delete/update
wanted to use filter conditions apart from replica identity then we
can later extend this functionality but not sure if the only way to
accomplish that is to log additional data in WAL. I am just trying to
see if we can provide meaningful functionality without extending too
much the scope of this work.
--
With Regards,
Amit Kapila.
On Fri, Sep 24, 2021 at 11:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Sep 24, 2021 at 11:06 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Fri, Sep 24, 2021 at 10:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
12) misuse of REPLICA IDENTITY
The more I think about this, the more I think we're actually misusing
REPLICA IDENTITY for something entirely different. The whole purpose of
RI was to provide a row identifier for the subscriber.
But now we're using it to ensure we have all the necessary columns,
which is entirely orthogonal to the original purpose. I predict this
will have rather negative consequences.
People will either switch everything to REPLICA IDENTITY FULL, or create
bogus unique indexes with extra columns. Which is really silly, because
it wastes network bandwidth (transfers more data) or local resources
(CPU and disk space to maintain extra indexes).
IMHO this needs more infrastructure to request extra columns to decode
(e.g. for the filter expression), and then remove them before sending
the data to the subscriber.
Yeah, but that would have an additional load on write operations and I
am not sure at this stage but maybe there could be other ways to
extend the current infrastructure wherein we build the snapshots using
which we can access the user tables instead of only catalog tables.
Such enhancements if feasible would be useful not only for allowing
additional column access in row filters but for other purposes like
allowing access to functions that access user tables. I feel we can
extend this later as well seeing the usage and requests. For the first
version, this doesn't sound too limiting to me.
I agree with one point from Tomas, that if we bind the row filter with
the RI, then if the user has to use the row filter on any column 1)
they have to add an unnecessary column to the index 2) Since they have
to add it to RI so now we will have to send it over the network as
well. 3). We anyway have to WAL log it if it is modified because now
we forced users to add some columns to RI because they wanted to use
the row filter on that. Now suppose we remove that limitation and we
somehow make these changes orthogonal to RI, i.e. if we have a row
filter on some column then we WAL log it, so now the only extra cost
we are paying is to just WAL log that column, but the user is not
forced to add it to index, not forced to send it over the network.
I am not suggesting adding additional columns to RI just for using
filter expressions. If most users that intend to publish delete/update
wanted to use filter conditions apart from replica identity then we
can later extend this functionality but not sure if the only way to
accomplish that is to log additional data in WAL.
One possibility in this regard could be that we enhance Replica
Identity .. Include (column_list) where all the columns in the include
list won't be sent but I think it is better to postpone such
enhancements for a later version. Like, I suggested above, we might
want to extend our infrastructure in a way where not only this extra
columns request can be accomplished but we should be able to allow
UDF's (where user tables can be accessed) and probably sub-queries as
well.
--
With Regards,
Amit Kapila.
On Fri, Sep 24, 2021 at 12:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
One possibility in this regard could be that we enhance Replica
Identity .. Include (column_list) where all the columns in the include
list won't be sent
Instead of RI's include column list, why can we not think of
row_filter's column list? I mean, like we log the old RI columns, can't
we do something similar for the row filter columns? With that, we
don't have to log all the columns; instead we only log the columns which
are in row filter, or is this too hard to identify during write
operation? So now the WAL logging requirement for RI and row filter
is orthogonal and if some columns are common then we can log only
once?
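[Editor's sketch] Dilip's suggestion amounts to WAL-logging the union of the replica-identity columns and the row-filter columns, so a column appearing in both lists is logged only once. A minimal sketch of that bookkeeping (the function and parameter names are illustrative only):

```python
def old_tuple_columns(ri_columns, filter_columns):
    """Columns needed for the old tuple: RI columns plus row-filter
    columns, with shared columns counted once (a set union)."""
    return set(ri_columns) | set(filter_columns)


def build_old_tuple(row, ri_columns, filter_columns):
    # Keep only the columns needed either to identify the row (RI)
    # or to evaluate the row filter; everything else is omitted.
    needed = old_tuple_columns(ri_columns, filter_columns)
    return {col: row[col] for col in needed}
```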
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Fri, Sep 24, 2021 at 12:19 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Fri, Sep 24, 2021 at 12:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
One possibility in this regard could be that we enhance Replica
Identity .. Include (column_list) where all the columns in the include
list won't be sent
Instead of RI's include column list, why can we not think of
row_filter's column list? I mean, like we log the old RI columns, can't
we do something similar for the row filter columns? With that, we
don't have to log all the columns; instead we only log the columns which
are in row filter, or is this too hard to identify during write
operation?
Yeah, we can do that as well but my guess is that will have some
additional work (to find common columns and log them only once) in
heap_delete/update and then probably during decoding (to assemble the
required filter and RI key). I am not very sure on this point, one has
to write code and test.
--
With Regards,
Amit Kapila.
On 9/24/21 8:09 AM, Amit Kapila wrote:
On Thu, Sep 23, 2021 at 6:03 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
13) turning update into insert
I agree with Ajin Cherian [4] that looking at just old or new row for
updates is not the right solution, because each option will "break" the
replica in some case. So I think the goal "keeping the replica in sync"
is the right perspective, and converting the update to insert/delete if
needed seems appropriate.
This seems somewhat similar to what pglogical does, because that may
also convert updates (although only to inserts, IIRC) when handling
replication conflicts. The difference is pglogical does all this on the
subscriber, while this makes the decision on the publisher.
I wonder if this might have some negative consequences, or whether
"moving" this to downstream would be useful for other purposes in the
future (e.g. it might be reused for handling other conflicts).
Apart from additional traffic, I am not sure how we will handle all
the conditions on subscribers, say if the new row doesn't match, how
will subscribers know about this unless we pass row_filter or some
additional information along with tuple. Previously, I have done some
research and shared in one of the emails above that IBM's InfoSphere
Data Replication [1] performs filtering in this way which also
suggests that we won't be off here.
I'm certainly not suggesting what we're doing is wrong. Given the design
of built-in logical replication it makes sense doing it this way, I was
just thinking aloud about what we might want to do in the future (e.g.
pglogical uses this to deal with conflicts between multiple sources, and
so on).
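[Editor's sketch] The "keeping the replica in sync" rule discussed above reduces to a four-way decision on whether the old and new row versions match the filter. This sketch captures the intended semantics, not the patch's actual code:

```python
def transform_update(old_matches, new_matches):
    """Publisher-side action for an UPDATE under a row filter."""
    if old_matches and new_matches:
        return "UPDATE"   # row was and remains visible to the subscriber
    if new_matches:
        return "INSERT"   # row enters the filtered set
    if old_matches:
        return "DELETE"   # row leaves the filtered set
    return None           # row was never visible; skip it entirely
```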
15) pgoutput_row_filter initializing filter
I'm not sure I understand why the filter initialization gets moved from
get_rel_sync_entry. Presumably, most of what the replication does is
replicating rows, so I see little point in not initializing this along
with the rest of the rel_sync_entry.
Sorry, IIRC, this has been suggested by me and I thought it was best
to do any expensive computation the first time it is required. I have
shared few cases like in [2] where it would lead to additional cost
without any gain. Unless I am missing something, I don't see any
downside of doing it in a delayed fashion.
Not sure, but the arguments presented there seem a bit wonky ...
Yes, the work would be wasted if we discard the cached data without
using it (it might happen for truncate, I'm not sure). But how likely is
it that such operations happen *in isolation*? I'd bet the workload is
almost never just a stream of truncates - there are always some
operations in between that would actually use this.
Similarly for the errors - IIRC hitting an error means the replication
restarts, which is orders of magnitude more expensive than anything we
can save by this delayed evaluation.
I'd keep it simple, for the sake of simplicity of the whole patch.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 9/24/21 7:20 AM, Amit Kapila wrote:
On Thu, Sep 23, 2021 at 6:03 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
6) parse_oper.c
I'm having some second thoughts about (not) allowing UDFs ...
Yes, I get that if the function starts failing, e.g. because querying a
dropped table or something, that breaks the replication and can't be
fixed without a resync.
The other problem is that users can access/query any table inside the
function and that also won't work in a logical decoding environment as
we use historic snapshots using which we can access only catalog
tables.
True. I always forget about some of these annoying issues. Let's
document all of this in some comment / README. I see we still don't have
src/backend/replication/logical/README
which is a bit surprising, considering how complex this code is.
That's pretty annoying, but maybe disallowing anything user-defined
(functions and operators) is overly anxious? Also, extensibility
is one of the hallmarks of Postgres, and disallowing all custom UDF and
operators seems to contradict that ...
Perhaps just explaining what the expression can / can't do in the docs,
with clear warnings of the risks, would be acceptable.
I think the right way to support functions is by the explicit marking
of functions and in one of the emails above Jeff Davis also agreed
with the same. I think we should probably introduce a new marking for
this. I feel this is important because without this it won't be safe
to access even some of the built-in functions that can access/update
database (non-immutable functions) due to logical decoding environment
restrictions.
I agree that seems reasonable. Is there any reason why not to just use
IMMUTABLE for this purpose? Seems like a good match to me.
Yes, the user can lie and label something that is not really IMMUTABLE,
but that's his fault. Yes, it's harder to fix than e.g. for indexes.
12) misuse of REPLICA IDENTITY
The more I think about this, the more I think we're actually misusing
REPLICA IDENTITY for something entirely different. The whole purpose of
RI was to provide a row identifier for the subscriber.
But now we're using it to ensure we have all the necessary columns,
which is entirely orthogonal to the original purpose. I predict this
will have rather negative consequences.
People will either switch everything to REPLICA IDENTITY FULL, or create
bogus unique indexes with extra columns. Which is really silly, because
it wastes network bandwidth (transfers more data) or local resources
(CPU and disk space to maintain extra indexes).
IMHO this needs more infrastructure to request extra columns to decode
(e.g. for the filter expression), and then remove them before sending
the data to the subscriber.
Yeah, but that would have an additional load on write operations and I
am not sure at this stage but maybe there could be other ways to
extend the current infrastructure wherein we build the snapshots using
which we can access the user tables instead of only catalog tables.
Such enhancements if feasible would be useful not only for allowing
additional column access in row filters but for other purposes like
allowing access to functions that access user tables. I feel we can
extend this later as well seeing the usage and requests. For the first
version, this doesn't sound too limiting to me.
I'm not really buying the argument that this means overhead for write
operations. Well, it does, but the current RI approach is forcing users
to either use RIF or add an index covering the filter attributes.
Neither of those options is free, and I'd bet the extra overhead of
adding just the row filter columns would be actually lower.
If the argument is merely to limit the scope of this patch, fine. But
I'd bet the amount of code we'd have to add to ExtractReplicaIdentity
(or maybe somewhere close to it) would be fairly small. We'd need to
cache which columns are needed (like RelationGetIndexAttrBitmap), and
this might be a bit more complex, due to having to consider all the
publications etc.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sat, Sep 25, 2021 at 3:30 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 9/24/21 7:20 AM, Amit Kapila wrote:
I think the right way to support functions is by the explicit marking
of functions and in one of the emails above Jeff Davis also agreed
with the same. I think we should probably introduce a new marking for
this. I feel this is important because without this it won't be safe
to access even some of the built-in functions that can access/update
database (non-immutable functions) due to logical decoding environment
restrictions.
I agree that seems reasonable. Is there any reason why not to just use
IMMUTABLE for this purpose? Seems like a good match to me.
It will just solve one part of the puzzle (related to database access)
but it is better to avoid the risk of broken replication by explicit
marking especially for UDFs or other user-defined objects. You seem to
be okay documenting such risk but I am not sure we have an agreement
on that especially because that was one of the key points of
discussions in this thread and various people told that we need to do
something about it. I personally feel we should do something if we
want to allow user-defined functions or operators because as reported
in the thread this problem has been reported multiple times. I think
we can go ahead with IMMUTABLE built-ins for the first version and
then allow UDFs later or let's try to find a way for explicit marking.
Yes, the user can lie and label something that is not really IMMUTABLE,
but that's his fault. Yes, it's harder to fix than e.g. for indexes.
Agreed and I think we can't do anything about this.
12) misuse of REPLICA IDENTITY
The more I think about this, the more I think we're actually misusing
REPLICA IDENTITY for something entirely different. The whole purpose of
RI was to provide a row identifier for the subscriber.
But now we're using it to ensure we have all the necessary columns,
which is entirely orthogonal to the original purpose. I predict this
will have rather negative consequences.
People will either switch everything to REPLICA IDENTITY FULL, or create
bogus unique indexes with extra columns. Which is really silly, because
it wastes network bandwidth (transfers more data) or local resources
(CPU and disk space to maintain extra indexes).
IMHO this needs more infrastructure to request extra columns to decode
(e.g. for the filter expression), and then remove them before sending
the data to the subscriber.
Yeah, but that would have an additional load on write operations and I
am not sure at this stage but maybe there could be other ways to
extend the current infrastructure wherein we build the snapshots using
which we can access the user tables instead of only catalog tables.
Such enhancements if feasible would be useful not only for allowing
additional column access in row filters but for other purposes like
allowing access to functions that access user tables. I feel we can
extend this later as well seeing the usage and requests. For the first
version, this doesn't sound too limiting to me.
I'm not really buying the argument that this means overhead for write
operations. Well, it does, but the current RI approach is forcing users
to either use RIF or add an index covering the filter attributes.
Neither of those options is free, and I'd bet the extra overhead of
adding just the row filter columns would be actually lower.
If the argument is merely to limit the scope of this patch, fine.
Yeah, that is one and I am not sure that adding extra WAL is the best
or only solution for this problem. As mentioned in my previous
response, I think we eventually need to find a way to access user
tables to support UDFs (that access database) or sub-query which other
databases already support, and for that, we might need to enhance the
current snapshot mechanism after which we might not need any
additional WAL even for additional columns in row filter. I don't
think anyone of us has evaluated in detail the different ways this
problem can be solved and the pros/cons of each approach, so limiting
the scope for this purpose doesn't seem like a bad idea to me.
--
With Regards,
Amit Kapila.
On Sat, Sep 25, 2021 at 3:07 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 9/24/21 8:09 AM, Amit Kapila wrote:
On Thu, Sep 23, 2021 at 6:03 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
13) turning update into insert
I agree with Ajin Cherian [4] that looking at just old or new row for
updates is not the right solution, because each option will "break" the
replica in some case. So I think the goal "keeping the replica in sync"
is the right perspective, and converting the update to insert/delete if
needed seems appropriate.
This seems somewhat similar to what pglogical does, because that may
also convert updates (although only to inserts, IIRC) when handling
replication conflicts. The difference is pglogical does all this on the
subscriber, while this makes the decision on the publisher.
I wonder if this might have some negative consequences, or whether
"moving" this to downstream would be useful for other purposes in the
future (e.g. it might be reused for handling other conflicts).
Apart from additional traffic, I am not sure how we will handle all
the conditions on subscribers, say if the new row doesn't match, how
will subscribers know about this unless we pass row_filter or some
additional information along with tuple. Previously, I have done some
research and shared in one of the emails above that IBM's InfoSphere
Data Replication [1] performs filtering in this way which also
suggests that we won't be off here.
I'm certainly not suggesting what we're doing is wrong. Given the design
of built-in logical replication it makes sense doing it this way, I was
just thinking aloud about what we might want to do in the future (e.g.
pglogical uses this to deal with conflicts between multiple sources, and
so on).
Fair enough.
15) pgoutput_row_filter initializing filter
I'm not sure I understand why the filter initialization gets moved from
get_rel_sync_entry. Presumably, most of what the replication does is
replicating rows, so I see little point in not initializing this along
with the rest of the rel_sync_entry.
Sorry, IIRC, this has been suggested by me and I thought it was best
to do any expensive computation the first time it is required. I have
shared few cases like in [2] where it would lead to additional cost
without any gain. Unless I am missing something, I don't see any
downside of doing it in a delayed fashion.
Not sure, but the arguments presented there seem a bit wonky ...
Yes, the work would be wasted if we discard the cached data without
using it (it might happen for truncate, I'm not sure). But how likely is
it that such operations happen *in isolation*? I'd bet the workload is
almost never just a stream of truncates - there are always some
operations in between that would actually use this.
It could also happen with a mix of truncate and other operations as we
decide whether to publish an operation or not after
get_rel_sync_entry.
Similarly for the errors - IIRC hitting an error means the replication
restarts, which is orders of magnitude more expensive than anything we
can save by this delayed evaluation.
I'd keep it simple, for the sake of simplicity of the whole patch.
The current version proposed by Peter is not reviewed yet and by
looking at it I have some questions too which I'll clarify in a
separate email. I am not sure if you are against delaying the
expression initialization because of the current code or concept as a
general because if it is later then we have other instances as well
when we don't do all the work in get_rel_sync_entry like building
tuple conversion map which is cached as well.
--
With Regards,
Amit Kapila.
On Mon, Sep 20, 2021 at 3:17 PM Ajin Cherian <itsajin@gmail.com> wrote:
I have not changed any of the first 5 patches, just added my patch 006
at the end. Do let me know of any comments on this approach.
I have a question regarding v29-0003-PS-ExprState-cache-modifications.
In pgoutput_row_filter, for row_filter, we are traversing ancestors of
a partition to find pub_relid but isn't that already available in
RelationSyncEntry as publish_as_relid?
--
With Regards,
Amit Kapila.
On 9/25/21 6:23 AM, Amit Kapila wrote:
On Sat, Sep 25, 2021 at 3:30 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 9/24/21 7:20 AM, Amit Kapila wrote:
I think the right way to support functions is by the explicit marking
of functions and in one of the emails above Jeff Davis also agreed
with the same. I think we should probably introduce a new marking for
this. I feel this is important because without this it won't be safe
to access even some of the built-in functions that can access/update
database (non-immutable functions) due to logical decoding environment
restrictions.I agree that seems reasonable. Is there any reason why not to just use
IMMUTABLE for this purpose? Seems like a good match to me.
It will just solve one part of the puzzle (related to database access)
but it is better to avoid the risk of broken replication by explicit
marking especially for UDFs or other user-defined objects. You seem to
be okay documenting such risk but I am not sure we have an agreement
on that especially because that was one of the key points of
discussions in this thread and various people told that we need to do
something about it. I personally feel we should do something if we
want to allow user-defined functions or operators because as reported
in the thread this problem has been reported multiple times. I think
we can go ahead with IMMUTABLE built-ins for the first version and
then allow UDFs later or let's try to find a way for explicit marking.
Well, I know multiple people mentioned that issue. And I certainly agree
just documenting the risk would not be an ideal solution. Requiring the
functions to be labeled helps, but we've seen people marking volatile
functions as immutable in order to allow indexing, so we'll have to
document the risks anyway.
All I'm saying is that allowing built-in functions/operators but not
user-defined variants seems like an annoying break of extensibility.
People are used to user-defined stuff working just like built-in
functions and operators.
Yes, the user can lie and label something that is not really IMMUTABLE,
but that's his fault. Yes, it's harder to fix than e.g. for indexes.
Agreed and I think we can't do anything about this.
12) misuse of REPLICA IDENTITY
The more I think about this, the more I think we're actually misusing
REPLICA IDENTITY for something entirely different. The whole purpose of
RI was to provide a row identifier for the subscriber.
But now we're using it to ensure we have all the necessary columns,
which is entirely orthogonal to the original purpose. I predict this
will have rather negative consequences.
People will either switch everything to REPLICA IDENTITY FULL, or create
bogus unique indexes with extra columns. Which is really silly, because
it wastes network bandwidth (transfers more data) or local resources
(CPU and disk space to maintain extra indexes).
IMHO this needs more infrastructure to request extra columns to decode
(e.g. for the filter expression), and then remove them before sending
the data to the subscriber.
Yeah, but that would have an additional load on write operations and I
am not sure at this stage but maybe there could be other ways to
extend the current infrastructure wherein we build the snapshots using
which we can access the user tables instead of only catalog tables.
Such enhancements if feasible would be useful not only for allowing
additional column access in row filters but for other purposes like
allowing access to functions that access user tables. I feel we can
extend this later as well seeing the usage and requests. For the first
version, this doesn't sound too limiting to me.
I'm not really buying the argument that this means overhead for write
operations. Well, it does, but the current RI approach is forcing users
to either use RIF or add an index covering the filter attributes.
Neither of those options is free, and I'd bet the extra overhead of
adding just the row filter columns would be actually lower.
If the argument is merely to limit the scope of this patch, fine.
Yeah, that is one and I am not sure that adding extra WAL is the best
or only solution for this problem. As mentioned in my previous
response, I think we eventually need to find a way to access user
tables to support UDFs (that access database) or sub-query which other
databases already support, and for that, we might need to enhance the
current snapshot mechanism after which we might not need any
additional WAL even for additional columns in row filter. I don't
think anyone of us has evaluated in detail the different ways this
problem can be solved and the pros/cons of each approach, so limiting
the scope for this purpose doesn't seem like a bad idea to me.
Understood. I don't have a very good idea which of those options is the
best one either, although I think enhancing the snapshot mechanism would
be rather tricky.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sat, Sep 25, 2021 at 3:36 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 9/25/21 6:23 AM, Amit Kapila wrote:
On Sat, Sep 25, 2021 at 3:30 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 9/24/21 7:20 AM, Amit Kapila wrote:
I think the right way to support functions is by the explicit marking
of functions and in one of the emails above Jeff Davis also agreed
with the same. I think we should probably introduce a new marking for
this. I feel this is important because without this it won't be safe
to access even some of the built-in functions that can access/update
database (non-immutable functions) due to logical decoding environment
restrictions.I agree that seems reasonable. Is there any reason why not to just use
IMMUTABLE for this purpose? Seems like a good match to me.
It will just solve one part of the puzzle (related to database access)
but it is better to avoid the risk of broken replication by explicit
marking especially for UDFs or other user-defined objects. You seem to
be okay documenting such risk but I am not sure we have an agreement
on that especially because that was one of the key points of
discussions in this thread and various people told that we need to do
something about it. I personally feel we should do something if we
want to allow user-defined functions or operators because as reported
in the thread this problem has been reported multiple times. I think
we can go ahead with IMMUTABLE built-ins for the first version and
then allow UDFs later or let's try to find a way for explicit marking.
Well, I know multiple people mentioned that issue. And I certainly agree
just documenting the risk would not be an ideal solution. Requiring the
functions to be labeled helps, but we've seen people marking volatile
functions as immutable in order to allow indexing, so we'll have to
document the risks anyway.
All I'm saying is that allowing built-in functions/operators but not
user-defined variants seems like an annoying break of extensibility.
People are used to user-defined stuff working just like built-in
functions and operators.
I agree with you that allowing UDFs in some way would be good for this
feature. I think once we get the base feature committed then we can
discuss whether and how to allow UDFs. Do we want to have an
additional label for it or can we come up with something which allows
the user to continue replication even if she has dropped the object
used in the function? It seems like we can limit the scope of base
patch functionality to allow the use of immutable built-in functions
in row filter expressions.
--
With Regards,
Amit Kapila.
Hi,
I see no one responded to this important part of my review so far:
On 9/23/21 2:33 PM, Tomas Vondra wrote:
3) create_subscription.sgml
<literal>WHERE</literal> clauses, rows must satisfy all expressions
to be copied. If the subscriber is a
I'm rather skeptical about the principle that all expressions have to
match - I'd have expected exactly the opposite behavior, actually.
I see a subscription as "a union of all publications". Imagine for
example you have a data set for all customers, and you create a
publication for different parts of the world, like
CREATE PUBLICATION customers_france
FOR TABLE customers WHERE (country = 'France');
CREATE PUBLICATION customers_germany
FOR TABLE customers WHERE (country = 'Germany');
CREATE PUBLICATION customers_usa
FOR TABLE customers WHERE (country = 'USA');
and now you want to subscribe to multiple publications, because you want
to replicate data for multiple countries (e.g. you want EU countries).
But if you do
CREATE SUBSCRIPTION customers_eu
PUBLICATION customers_france, customers_germany;
then you won't get anything, because each customer belongs to just a
single country. Yes, I could create multiple individual subscriptions,
one for each country, but that's inefficient and may have a different
set of issues (e.g. keeping them in sync when a customer moves between
countries).
I might have missed something, but I haven't found any explanation why
the requirement to satisfy all expressions is the right choice.
IMHO this should be 'satisfies at least one expression' i.e. we should
connect the expressions by OR, not AND.
Am I the only one finding the current behavior strange? What's the
reasoning supporting the current approach?
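[Editor's sketch] Running the customers example above through both combining rules makes the point concrete: with AND, no row can satisfy two mutually exclusive country filters, so nothing replicates, while OR yields the expected union:

```python
customers = [
    {"name": "Alice", "country": "France"},
    {"name": "Bob", "country": "Germany"},
    {"name": "Carol", "country": "USA"},
]

# Row filters of the two publications in the subscription.
filters = [
    lambda row: row["country"] == "France",   # customers_france
    lambda row: row["country"] == "Germany",  # customers_germany
]

# Current behaviour: a row must satisfy ALL filters (AND) -> empty result.
replicated_and = [r for r in customers if all(f(r) for f in filters)]

# Suggested behaviour: a row must satisfy AT LEAST ONE filter (OR).
replicated_or = [r for r in customers if any(f(r) for f in filters)]
```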
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Sep 27, 2021, at 10:34 AM, Tomas Vondra wrote:
Hi,
I see no one responded to this important part of my review so far:
I'm still preparing a new patch and a summary.
Am I the only one finding the current behavior strange? What's the
reasoning supporting the current approach?
I think it is an oversight from my side. It used to work the way you mentioned
but I changed it. I'll include this change in the next patch.
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Mon, Sep 27, 2021 at 7:19 PM Euler Taveira <euler@eulerto.com> wrote:
On Mon, Sep 27, 2021, at 10:34 AM, Tomas Vondra wrote:
Hi,
I see no one responded to this important part of my review so far:
I'm still preparing a new patch and a summary.
Am I the only one finding the current behavior strange? What's the
reasoning supporting the current approach?
I think it is an oversight from my side. It used to work the way you mentioned
but I changed it. I'll include this change in the next patch.
+1.
--
With Regards,
Amit Kapila.
On Wed, Sep 22, 2021 at 2:05 PM Ajin Cherian <itsajin@gmail.com> wrote:
On Wed, Sep 22, 2021 at 1:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Sep 22, 2021 at 6:42 AM Ajin Cherian <itsajin@gmail.com> wrote:
Why do you think that the second assumption (if there is an old tuple
it will contain all RI key fields.) is broken? It seems to me even
when we are planning to include unchanged toast as part of old_key, it
will contain all the key columns, isn't that true?

Yes, I assumed wrongly. Just checked. What you say is correct.
I think we
still need to deform both old tuple and new tuple, just to handle this case.

Yeah, but we were anyway talking about saving that cost for later if
we decide to send that tuple. I think we can further try to optimize
it by first checking whether the new tuple has any toasted value; only
if so do we need this extra pass of deforming.

Ok, I will go ahead with this approach.
There is currently logic in ReorderBufferToastReplace() which already
deforms the new tuple
to detoast changed toasted fields in the new tuple. I think if we can
enhance this logic for our
purpose, then we can avoid an extra deform of the new tuple.
But I think you had earlier indicated that having untoasted unchanged
values in the new tuple can be bothersome.

I think it will be too costly on the subscriber side during apply
because it will update all the unchanged toasted values which will
lead to extra writes both for WAL and data.
Based on the discussion above, I've added two more slot pointers in
the RelationSyncEntry structure to store tuples that have been
deformed. Once the tuple (old and new) is deformed, it is stored
in the structure, where it can be retrieved while writing to the
stream.

I have also changed the logic so that the old tuple is not
populated; as Dilip pointed out, it will have all the RI columns if it
is changed.

I've added two new APIs in proto.c for writing tuple cached and
writing update cached. These are called if the slots
contain previously deformed tuples.
I have for now also rebased the patch and merged the first 5 patches
into 1, and added my changes for the above into the second patch.
regards,
Ajin Cherian
Fujitsu Australia
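The caching scheme described above can be sketched roughly as follows. This is a minimal Python model, not the actual C code: the class, function names, and counter are all invented for illustration. The point is simply that the tuple is deformed at most once per change, cached on the per-relation entry, and reused when writing to the stream:

```python
class RelationSyncEntry:
    """Per-relation cache, loosely modeled on the two new slot pointers."""
    def __init__(self):
        self.new_slot = None  # cached deformed new tuple

deform_calls = 0

def deform(raw_tuple):
    # Stand-in for tuple deforming; count calls to show the saving.
    global deform_calls
    deform_calls += 1
    return dict(raw_tuple)

def row_filter_passes(entry, raw_new, predicate):
    # Deform once for filter evaluation and cache the result.
    entry.new_slot = deform(raw_new)
    return predicate(entry.new_slot)

def write_insert(entry, raw_new):
    # Reuse the cached slot instead of deforming a second time.
    slot = entry.new_slot if entry.new_slot is not None else deform(raw_new)
    return slot

entry = RelationSyncEntry()
raw = [("id", 1), ("country", "France")]
if row_filter_passes(entry, raw, lambda t: t["id"] > 0):
    write_insert(entry, raw)
assert deform_calls == 1  # deformed once, reused for the write
```

The same idea extends to the old tuple, and to skipping the extra deform pass entirely when the new tuple contains no toasted values.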
Attachments:
v30-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v30-0001-Row-filter-for-logical-replication.patchDownload
From 946ba28d0d3d95d2d8c15066ff86b514930cb7a9 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Sat, 2 Oct 2021 01:25:16 -0400
Subject: [PATCH v30] Row filter for logical replication
This feature adds row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If the subscriber is a pre-15 version, data
synchronization won't use row filters if they are defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/dependency.c | 126 +++++++++++
src/backend/catalog/pg_publication.c | 133 ++++++++++-
src/backend/commands/publicationcmds.c | 104 +++++----
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 5 +-
src/backend/parser/parse_agg.c | 13 ++
src/backend/parser/parse_expr.c | 25 ++-
src/backend/parser/parse_func.c | 6 +
src/backend/parser/parse_oper.c | 9 +
src/backend/replication/logical/tablesync.c | 95 +++++++-
src/backend/replication/pgoutput/pgoutput.c | 329 +++++++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/bin/psql/tab-complete.c | 10 +-
src/include/catalog/dependency.h | 9 +-
src/include/catalog/pg_publication.h | 4 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 5 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 164 ++++++++++++++
src/test/regress/sql/publication.sql | 112 ++++++++++
src/test/subscription/t/025_row_filter.pl | 299 +++++++++++++++++++++++++
28 files changed, 1478 insertions(+), 81 deletions(-)
create mode 100644 src/test/subscription/t/025_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 384e6ea..d117652 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6231,6 +6231,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b..4bb4314 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbc..8f78fbb 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -183,6 +187,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or that are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will not
+ be replicated. That's because the old row is used and it only contains the
+ primary key or columns that are part of the <literal>REPLICA IDENTITY</literal>;
+ the remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ and <command>UPDATE</command> operations, any column might be used in the
+ <literal>WHERE</literal> clause. The new row is used and it contains all
+ columns. A <literal>NULL</literal> value causes the expression to evaluate
+ to false; avoid using columns without not-null constraints in the
+ <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
+ not allow functions or user-defined operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -197,6 +216,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -210,6 +234,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..5a9430e 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -206,7 +206,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether to copy pre-existing data in the publications
that are being subscribed to when the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index 91c3e97..405b3cd 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -132,6 +132,12 @@ typedef struct
int subflags; /* flags to pass down when recursing to obj */
} ObjectAddressAndFlags;
+/* for rowfilter_walker */
+typedef struct
+{
+ char *relname;
+} rf_context;
+
/* for find_expr_references_walker */
typedef struct
{
@@ -1554,6 +1560,126 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Walker checks that the row filter expression is legal. Allow only simple
+ * or compound expressions like:
+ *
+ * "(Var Op Const)" or
+ * "(Var Op Const) Bool (Var Op Const)"
+ *
+ * Nothing more complicated is permitted. Specifically, no functions of any kind
+ * and no user-defined operators.
+ */
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var) || IsA(node, Const) || IsA(node, BoolExpr))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ forbidden = _("function calls are not allowed");
+ }
+ else
+ {
+ elog(DEBUG1, "row filter contained something unexpected: %s", nodeToString(node));
+ forbidden = _("too complex");
+ }
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ context->relname),
+ errdetail("%s", forbidden),
+ errhint("only simple expressions using columns and constants are allowed")
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
+/*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if it encounters anything not permitted or unexpected.
+ */
+void
+rowfilter_validator(char *relname, Node *expr)
+{
+ rf_context context = {0};
+
+ context.relname = relname;
+ rowfilter_walker(expr, &context);
+}
+
+/*
+ * Find all the columns referenced by the row-filter expression and return what
+ * is found as a list of RfCol. This list is used for row-filter validation.
+ */
+List *
+rowfilter_find_cols(Node *expr, Oid relId)
+{
+ find_expr_references_context context;
+ RangeTblEntry rte;
+ int ref;
+ List *rfcol_list = NIL;
+
+ context.addrs = new_object_addresses();
+
+ /* We gin up a rather bogus rangetable list to handle Vars */
+ MemSet(&rte, 0, sizeof(rte));
+ rte.type = T_RangeTblEntry;
+ rte.rtekind = RTE_RELATION;
+ rte.relid = relId;
+ rte.relkind = RELKIND_RELATION; /* no need for exactness here */
+ rte.rellockmode = AccessShareLock;
+
+ context.rtables = list_make1(list_make1(&rte));
+
+ /* Scan the expression tree for referenceable objects */
+ find_expr_references_walker(expr, &context);
+
+ /* Remove any duplicates */
+ eliminate_duplicate_dependencies(context.addrs);
+
+ /* Build/Return the list of columns referenced by this Row Filter */
+ for (ref = 0; ref < context.addrs->numrefs; ref++)
+ {
+ ObjectAddress *thisobj = context.addrs->refs + ref;
+
+ if (thisobj->classId == RelationRelationId)
+ {
+ RfCol *rfcol;
+
+ /*
+ * The parser already took care of ensuring columns must be from
+ * the correct table.
+ */
+ Assert(thisobj->objectId == relId);
+
+ rfcol = palloc(sizeof(RfCol));
+ rfcol->name = get_attname(thisobj->objectId, thisobj->objectSubId, false);
+ rfcol->attnum = thisobj->objectSubId;
+
+ rfcol_list = lappend(rfcol_list, rfcol);
+ }
+ }
+
+ free_object_addresses(context.addrs);
+
+ return rfcol_list;
+}
+
+/*
* recordDependencyOnExpr - find expression dependencies
*
* This is used to find the dependencies of rules, constraint expressions,
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9cd0c82..7b94192 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -34,6 +34,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -138,6 +141,86 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
}
/*
+ * Walk the parse-tree to decide if the row-filter is valid or not.
+ */
+static void
+rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node *rfnode, Relation rel)
+{
+ Oid relid = RelationGetRelid(rel);
+ char *relname = RelationGetRelationName(rel);
+
+ /*
+ * Rule:
+ *
+ * Walk the parse-tree and reject anything more complicated than a very
+ * simple expression.
+ */
+ rowfilter_validator(relname, rfnode);
+
+ /*
+ * Rule:
+ *
+ * If the publish operation contains "delete" then only columns that
+ * are allowed by the REPLICA IDENTITY rules are permitted to be used in
+ * the row-filter WHERE clause.
+ *
+ * TODO - check later for publish "update" case.
+ */
+ if (pub->pubactions.pubdelete)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ List *rfcols;
+ ListCell *lc;
+ Bitmapset *bms_okcols;
+
+ /*
+ * Find what are the cols that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+
+ /*
+ * Find what cols are referenced in the row filter WHERE clause,
+ * and validate that each of those referenced cols is allowed.
+ */
+ rfcols = rowfilter_find_cols(rfnode, relid);
+ foreach(lc, rfcols)
+ {
+ RfCol *rfcol = lfirst(lc);
+ char *colname = rfcol->name;
+ int attnum = rfcol->attnum;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, bms_okcols))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ relname),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+ }
+
+ bms_free(bms_okcols);
+ list_free_deep(rfcols);
+ }
+ }
+}
+
+/*
* Gets the relations based on the publication partition option for a specified
* relation.
*/
@@ -178,21 +261,26 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
/*
* Check for duplicates. Note that this does not really prevent
@@ -210,10 +298,33 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION WHERE");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, pstate, whereclause, targetrel);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -227,6 +338,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -243,6 +360,14 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 9c7f916..747f388 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -391,38 +391,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- PublicationRelInfo *newpubrel;
-
- newpubrel = (PublicationRelInfo *) lfirst(newlc);
- if (RelationGetRelid(newpubrel->relation) == oldrelid)
- {
- found = true;
- break;
- }
- }
- /* Not yet in the list, open it and add to the list */
- if (!found)
- {
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
- }
+ PublicationRelInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -552,9 +538,10 @@ RemovePublicationById(Oid pubid)
}
/*
- * Open relations specified by a PublicationTable list.
- * In the returned list of PublicationRelInfo, tables are locked
- * in ShareUpdateExclusiveLock mode in order to add them to a publication.
+ * Open relations specified by a RangeVar list (PublicationTable or Relation).
+ *
+ * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
+ * add them to a publication.
*/
static List *
OpenTableList(List *tables)
@@ -562,22 +549,46 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelInfo *pub_rel;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- PublicationTable *t = lfirst_node(PublicationTable, lc);
- bool recurse = t->relation->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
- PublicationRelInfo *pub_rel;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a PublicationTable List
+ * (Relation, Where clause). ALTER PUBLICATION ... DROP TABLE provides
+ * a Relation List. Check the List element to be used.
+ */
+ if (IsA(lfirst(lc), PublicationTable))
+ whereclause = true;
+ else
+ whereclause = false;
+
+ if (whereclause)
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+ else
+ {
+ rv = lfirst_node(RangeVar, lc);
+ }
+
+ recurse = rv->inh;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
- rel = table_openrv(t->relation, ShareUpdateExclusiveLock);
+ rel = table_openrv(rv, ShareUpdateExclusiveLock);
myrelid = RelationGetRelid(rel);
/*
@@ -594,7 +605,12 @@ OpenTableList(List *tables)
}
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = myrelid;
pub_rel->relation = rel;
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
@@ -629,7 +645,13 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = childrelid;
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -656,6 +678,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -676,7 +700,7 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(pub_rel->relid, GetUserId()))
aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
RelationGetRelationName(rel));
@@ -705,11 +729,9 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
PublicationRelInfo *pubrel = (PublicationRelInfo *) lfirst(lc);
- Relation rel = pubrel->relation;
- Oid relid = RelationGetRelid(rel);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pubrel->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -719,7 +741,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pubrel->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 228387e..a69e131 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4964,6 +4964,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 800f588..7a33695 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -3137,6 +3137,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e3068a3..9765aeb 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9637,10 +9637,11 @@ publication_table_list:
{ $$ = lappend($1, $3); }
;
-publication_table: relation_expr
+publication_table: relation_expr OptWhereClause
{
PublicationTable *n = makeNode(PublicationTable);
n->relation = $1;
+ n->whereClause = $2;
$$ = (Node *) n;
}
;
@@ -9681,7 +9682,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE publication_table_list
+ | ALTER PUBLICATION name DROP TABLE relation_expr_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..5e0c391 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,14 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+#endif
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +951,11 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
+ err = _("window functions are not allowed in publication WHERE expressions");
+#endif
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3519e62 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,21 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+#if 0
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+#endif
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +517,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1778,11 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
+ err = _("cannot use subquery in publication WHERE expression");
+#endif
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3103,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..de9600f 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,12 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ pstate->p_hasTargetSRFs = true;
+#if 0
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+#endif
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..b3588df 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,15 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+#if 0
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+#endif
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
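For reference, the (currently `#if 0`-ed) check in make_op() relies on the fact that OIDs below FirstNormalObjectId are reserved for system objects, so any operator with an OID at or above that threshold must be user-defined. A minimal Python sketch of that test, assuming the usual value of 16384 for FirstNormalObjectId:

```python
# Sketch of the user-defined-operator check from make_op() (disabled with
# #if 0 in the patch).  OIDs below FirstNormalObjectId are reserved for
# built-in catalog objects; anything at or above it was created by a user.
FIRST_NORMAL_OBJECT_ID = 16384  # FirstNormalObjectId in transam.h

def is_user_defined_operator(oper_oid: int) -> bool:
    """Return True if the operator OID denotes a user-created operator."""
    return oper_oid >= FIRST_NORMAL_OBJECT_ID
```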
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..9d86a10 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get relation qual. DISTINCT prevents the same expression, defined for
+ * a table in multiple publications, from being included multiple times
+ * in the final expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
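To make the COPY construction in copy_table() easier to follow, here is a small Python model of how the AND'ed filter list ends up in the command text. This is an illustration only: identifier quoting (quote_qualified_identifier) and the non-table SELECT case are omitted, and the function name is made up for the sketch.

```python
def build_copy_command(nspname, relname, attnames, quals):
    """Model of copy_table(): plain COPY when there is no row filter,
    otherwise COPY (SELECT ...) with all publication row filters AND'ed."""
    table = f"{nspname}.{relname}"
    if not quals:
        return f"COPY {table} TO STDOUT"
    cols = ", ".join(attnames)
    where = " AND ".join(quals)   # filters from all publications are AND'ed
    return f"COPY (SELECT {cols} FROM {table} WHERE {where}) TO STDOUT"
```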
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 14d737f..ce5e1c5 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -116,6 +125,16 @@ typedef struct RelationSyncEntry
PublicationActions pubactions;
/*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' only means that exprstate_list is correct -
+ * it doesn't mean that there actually is any row filter present for the
+ * current relid.
+ */
+ bool rowfilter_valid;
+ List *exprstate_list; /* ExprState for row filter(s) */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+
+ /*
* OID of the relation to publish changes as. For a partition, this may
* be set to one of its ancestors whose schema will be used when
* replicating changes, if publish_via_partition_root is set for the
@@ -137,7 +156,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +165,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +647,250 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache ExprState using CacheMemoryContext. This is the same code as
+ * ExecPrepareExpr(), which is not used here because we have no EState.
+ * There should probably be a separate executor function to handle
+ * execution outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false, i.e. the change
+ * isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * Returns true if the change should be replicated, otherwise false.
+ */
+static bool
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+
+ /*
+ * If the row filter caching is currently flagged "invalid" then it means we
+ * don't know yet whether there are any row filters for this relation.
+ *
+ * This code usually runs only once per relation.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ bool am_partition = get_rel_relispartition(relid);
+ MemoryContext oldctx;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains. Release the tuple table slot if it
+ * already exists.
+ */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState(s) and cache them in the
+ * entry->exprstate_list.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+ Oid pub_relid = relid;
+
+ if (pub->pubviaroot && am_partition)
+ {
+ if (pub->alltables)
+ pub_relid = llast_oid(get_partition_ancestors(relid));
+ else
+ {
+ List *ancestors = get_partition_ancestors(relid);
+ ListCell *lc2;
+
+ /*
+ * Find the "topmost" ancestor that is in this
+ * publication.
+ */
+ foreach(lc2, ancestors)
+ {
+ Oid ancestor = lfirst_oid(lc2);
+
+ if (list_member_oid(GetRelationPublications(ancestor),
+ pub->oid))
+ {
+ pub_relid = ancestor;
+ }
+ }
+ }
+ }
+
+ /*
+ * Lookup if there is a row-filter, and if so build the ExprState for it.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(pub_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate_list = lappend(entry->exprstate_list, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+
+ } /* loop all subscribed publications */
+
+ entry->rowfilter_valid = true;
+ }
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate_list == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, entry->exprstate_list)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +917,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +941,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +948,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * Send the schema before the relation-switching logic below,
+ * because maybe_send_schema() also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +981,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +1015,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +1084,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1113,9 +1403,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1136,8 +1427,11 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate_list = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1230,9 +1524,6 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
}
list_free(pubids);
@@ -1339,6 +1630,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanups. (Will be rebuilt later if needed).
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstate_list != NIL)
+ {
+ list_free_deep(entry->exprstate_list);
+ entry->exprstate_list = NIL;
+ }
}
}
@@ -1350,6 +1656,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1359,6 +1666,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1377,6 +1686,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
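The evaluation semantics of pgoutput_row_filter() can be summarized in a few lines of Python (a model of the behavior, not the patch's C code): every filter from every subscribed publication must evaluate to true, and a NULL result counts as false.

```python
def row_filter_matches(filters, row):
    """Model of pgoutput_row_filter(): the change is replicated only if
    every filter evaluates to true; NULL (None here) counts as false."""
    for rf in filters:
        result = rf(row)        # a filter returns True, False, or None
        if result is not True:  # SQL NULL (None) means "do not replicate"
            return False
    return True                 # no filters at all -> replicate everything
```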
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index a485fb2..0f4892c 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4141,6 +4141,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4151,9 +4152,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4162,6 +4170,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4202,6 +4211,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4234,8 +4247,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 29af845..f932a70 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -629,6 +629,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index a33d77c..83249e8 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5cd5838..8686ec6 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1648,6 +1648,11 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET", "TABLE", MatchAny)
+ || Matches("ALTER", "PUBLICATION", MatchAny, "ADD", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (");
/* ALTER PUBLICATION <name> SET ( */
else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("publish", "publish_via_partition_root");
@@ -2693,9 +2698,10 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("TABLE", "ALL TABLES");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL"))
COMPLETE_WITH("TABLES");
- else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES")
- || Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("WITH (");
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 2885f35..ec1cb75 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -16,7 +16,6 @@
#include "catalog/objectaddress.h"
-
/*
* Precise semantics of a dependency relationship are specified by the
* DependencyType code (which is stored in a "char" field in pg_depend,
@@ -151,6 +150,14 @@ extern void performDeletion(const ObjectAddress *object,
extern void performMultipleDeletions(const ObjectAddresses *objects,
DropBehavior behavior, int flags);
+typedef struct RfCol {
+ char *name;
+ int attnum;
+} RfCol;
+extern List *rowfilter_find_cols(Node *expr, Oid relId);
+
+extern void rowfilter_validator(char *relname, Node *expr);
+
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
DependencyType behavior);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 82f2536..e5df91e 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,7 +85,9 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
+ Oid relid;
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -116,7 +118,7 @@ extern List *GetPubPartitionOptionRelations(List *result,
Oid relid);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3138877..1dfdaf3 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3640,6 +3640,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
typedef struct CreatePublicationStmt
@@ -3647,7 +3648,7 @@ typedef struct CreatePublicationStmt
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3660,7 +3661,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 82bce9b..0443823 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -167,6 +167,170 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+-- Test row filter for publications
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | f | f | f | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | f | f | f | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | f | f | f | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | f | f | f | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl4"
+DETAIL: function calls are not allowed
+HINT: only simple expressions using columns and constants are allowed
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+HINT: only simple expressions using columns and constants are allowed
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: too complex
+HINT: only simple expressions using columns and constants are allowed
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index e5745d5..209eab7 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -100,6 +100,118 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+-- Test row filter for publications
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
new file mode 100644
index 0000000..dc9becc
--- /dev/null
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -0,0 +1,299 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions must match. In this case, rows are replicated only if the c
+# value is divisible by both 2 AND 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false, so the row filter
+# of each leaf partition is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, because b IS NULL makes (b <> 'filtered') evaluate to NULL, which is treated as no match
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true, so the row filter
+# of the root partitioned table is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
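The TAP test above exercises two row-filter behaviors that are easy to get wrong: when several publications publish the same table, every filter expression must match (so tap_pub_1's `(c % 2 = 0)` and tap_pub_2's `(c % 3 = 0)` intersect, leaving only 6, 12, 18), and a WHERE clause that evaluates to NULL counts as no match (which is why the UPDATE setting b = NULL stops replicating). A minimal Python sketch of those semantics — the helper names are illustrative, not symbols from the patch:

```python
def matches(row, filters):
    """Return True only if every publication's filter matches.

    A filter returning None models SQL NULL, which counts as no match,
    mirroring (b <> 'filtered') when b IS NULL.
    """
    for f in filters:
        result = f(row)
        if result is None or result is False:
            return False
    return True

# Filters from the TAP test for tab_rowfilter_2:
# tap_pub_1: (c % 2 = 0), tap_pub_2: (c % 3 = 0)
filters_tbl2 = [lambda r: r["c"] % 2 == 0,
                lambda r: r["c"] % 3 == 0]

replicated = [c for c in range(1, 21)
              if matches({"c": c}, filters_tbl2)]
# Only multiples of both 2 and 3 survive: [6, 12, 18]

# Filter for tab_rowfilter_1: (a > 1000 AND b <> 'filtered');
# None stands in for SQL NULL in column b.
def filter_tbl1(r):
    if r["b"] is None:
        return None  # NULL <> 'filtered' yields NULL, not true
    return r["a"] > 1000 and r["b"] != "filtered"
```

Under this model the UPDATE to (1600, NULL) is filtered out even though a > 1000, matching the expected .out above.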
Attachment: v30-0002-Support-updates-based-on-old-and-new-tuple-in-ro.patch (application/octet-stream)
From dc3221095be3b55c8929d993a16750f57d2e5e62 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Sat, 2 Oct 2021 02:46:03 -0400
Subject: [PATCH v30] Support updates based on old and new tuple in row filters
When applying row filter on updates, check both old_tuple and new_tuple
to decide how an update needs to be transformed.
UPDATE
old-row (match) new-row (no match) -> DELETE
old-row (no match) new row (match) -> INSERT
old-row (match) new row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Tuples that have been deformed are also cached in slots to avoid
deforming the same tuple multiple times.
---
src/backend/replication/logical/proto.c | 122 ++++++++++++++++++++
src/backend/replication/pgoutput/pgoutput.c | 168 +++++++++++++++++++++++++---
src/include/replication/logicalproto.h | 4 +
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/025_row_filter.pl | 4 +-
5 files changed, 285 insertions(+), 19 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..6b14340 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -32,6 +33,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
HeapTuple tuple, bool binary);
+static void logicalrep_write_tuple_cached(StringInfo out, Relation rel,
+ TupleTableSlot *slot, bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -438,6 +441,38 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
}
/*
+ * Write UPDATE to the output stream using cached virtual slots.
+ * Cached updates will have both old tuple and new tuple.
+ */
+void
+logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple, bool binary)
+{
+ pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
+
+ Assert(rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT ||
+ rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
+ rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX);
+
+ /* transaction ID (if not valid, we're not streaming) */
+ if (TransactionIdIsValid(xid))
+ pq_sendint32(out, xid);
+
+ /* use Oid as relation identifier */
+ pq_sendint32(out, RelationGetRelid(rel));
+
+ if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ pq_sendbyte(out, 'O'); /* old tuple follows */
+ else
+ pq_sendbyte(out, 'K'); /* old key follows */
+ logicalrep_write_tuple_cached(out, rel, oldtuple, binary);
+
+ pq_sendbyte(out, 'N'); /* new tuple follows */
+ logicalrep_write_tuple_cached(out, rel, newtuple, binary);
+}
+
+
+/*
* Write UPDATE to the output stream.
*/
void
@@ -746,6 +781,93 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
}
/*
+ * Write a tuple to the output stream using a cached slot, in the most efficient format possible.
+ */
+static void
+logicalrep_write_tuple_cached(StringInfo out, Relation rel, TupleTableSlot *slot, bool binary)
+{
+ TupleDesc desc;
+ int i;
+ uint16 nliveatts = 0;
+ HeapTuple tuple = ExecFetchSlotHeapTuple(slot, false, NULL);
+
+ desc = RelationGetDescr(rel);
+
+ for (i = 0; i < desc->natts; i++)
+ {
+ if (TupleDescAttr(desc, i)->attisdropped || TupleDescAttr(desc, i)->attgenerated)
+ continue;
+ nliveatts++;
+ }
+ pq_sendint16(out, nliveatts);
+
+ /* try to allocate enough memory from the get-go */
+ enlargeStringInfo(out, tuple->t_len +
+ nliveatts * (1 + 4));
+
+ /* Write the values */
+ for (i = 0; i < desc->natts; i++)
+ {
+ HeapTuple typtup;
+ Form_pg_type typclass;
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ if (att->attisdropped || att->attgenerated)
+ continue;
+
+ if (slot->tts_isnull[i])
+ {
+ pq_sendbyte(out, LOGICALREP_COLUMN_NULL);
+ continue;
+ }
+
+ if (att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(slot->tts_values[i]))
+ {
+ /*
+ * Unchanged toasted datum. (Note that we don't promise to detect
+ * unchanged data in general; this is just a cheap check to avoid
+ * sending large values unnecessarily.)
+ */
+ pq_sendbyte(out, LOGICALREP_COLUMN_UNCHANGED);
+ continue;
+ }
+
+ typtup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(att->atttypid));
+ if (!HeapTupleIsValid(typtup))
+ elog(ERROR, "cache lookup failed for type %u", att->atttypid);
+ typclass = (Form_pg_type) GETSTRUCT(typtup);
+
+ /*
+ * Send in binary if requested and type has suitable send function.
+ */
+ if (binary && OidIsValid(typclass->typsend))
+ {
+ bytea *outputbytes;
+ int len;
+
+ pq_sendbyte(out, LOGICALREP_COLUMN_BINARY);
+ outputbytes = OidSendFunctionCall(typclass->typsend, slot->tts_values[i]);
+ len = VARSIZE(outputbytes) - VARHDRSZ;
+ pq_sendint(out, len, 4); /* length */
+ pq_sendbytes(out, VARDATA(outputbytes), len); /* data */
+ pfree(outputbytes);
+ }
+ else
+ {
+ char *outputstr;
+
+ pq_sendbyte(out, LOGICALREP_COLUMN_TEXT);
+ outputstr = OidOutputFunctionCall(typclass->typoutput, slot->tts_values[i]);
+ pq_sendcountedtext(out, outputstr, strlen(outputstr), false);
+ pfree(outputstr);
+ }
+
+ ReleaseSysCache(typtup);
+ }
+}
+
+
+/*
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index ce5e1c5..bdd61f9 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -133,6 +133,8 @@ typedef struct RelationSyncEntry
bool rowfilter_valid;
List *exprstate_list; /* ExprState for row filter(s) */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* tuple table slot for storing deformed new tuple */
+ TupleTableSlot *old_tuple; /* tuple table slot for storing deformed old tuple */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -167,10 +169,14 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry,
+ ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -734,18 +740,107 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * Check the update against the row filter, if any. Updates are
+ * transformed into inserts or deletes based on whether the old_tuple
+ * and new_tuple match the filter. The new action is returned in the
+ * action parameter; if not changed, the action remains an update.
+ * old-row (match)    new-row (no match) -> DELETE
+ * old-row (no match) new-row (match)    -> INSERT
+ * old-row (match)    new-row (match)    -> UPDATE
+ * old-row (no match) new-row (no match) -> (drop change)
+ * If this returns true, the change is to be replicated; otherwise it is not.
*/
static bool
-pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter_update(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
{
- EState *estate;
- ExprContext *ecxt;
+ TupleDesc desc = entry->scantuple->tts_tupleDescriptor;
+ int i;
+ bool old_matched, new_matched, newtup_changed = false;
+ HeapTuple tmpnewtuple;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate_list == NIL)
+ return true;
+
+ /* updates require a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /*
+ * If there is no old_tuple, then none of the replica identity columns
+ * changed, and this reduces to a simple update.
+ */
+ if (!oldtuple)
+ return pgoutput_row_filter(relation, NULL, newtuple, entry);
+
+
+ old_slot = MakeSingleTupleTableSlot(desc, &TTSOpsVirtual);
+ new_slot = MakeSingleTupleTableSlot(desc, &TTSOpsVirtual);
+ tmp_new_slot = MakeSingleTupleTableSlot(desc, &TTSOpsVirtual);
+
+ /*
+ * Unchanged toasted replica identity columns are only detoasted in the
+ * old tuple; copy them over to the new tuple.
+ */
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+ entry->old_tuple = old_slot;
+ entry->new_tuple = new_slot;
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, entry->new_tuple);
+
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ newtup_changed = true;
+ }
+
+ }
+
+ if (newtup_changed)
+ tmpnewtuple = heap_form_tuple(desc, tmp_new_slot->tts_values, new_slot->tts_isnull);
+
+ old_matched = pgoutput_row_filter(relation, NULL, oldtuple, entry);
+ new_matched = pgoutput_row_filter(relation, NULL,
+ newtup_changed ? tmpnewtuple : newtuple, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && !old_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter the first time it is needed.
+ */
+
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
+{
+ Oid relid = RelationGetRelid(relation);
ListCell *lc;
- bool result = true;
- Oid relid = RelationGetRelid(relation);
/*
* If the row filter caching is currently flagged "invalid" then it means we
@@ -769,9 +864,13 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
+
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = NULL;
+ entry->new_tuple = NULL;
+
MemoryContextSwitchTo(oldctx);
/*
@@ -846,6 +945,21 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
entry->rowfilter_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ ListCell *lc;
/* Bail out if there is no row filter */
if (entry->exprstate_list == NIL)
@@ -941,6 +1055,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -949,7 +1066,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
break;
/*
@@ -980,9 +1097,11 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter_update(relation, oldtuple, newtuple, relentry,
+ &modified_action))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1005,8 +1124,29 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ if (relentry->old_tuple && relentry->new_tuple)
+ logicalrep_write_update_cached(ctx->out, xid, relation,
+ relentry->old_tuple, relentry->new_tuple, data->binary);
+ else
+ logicalrep_write_update(ctx->out, xid, relation, oldtuple,
+ newtuple, data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1016,7 +1156,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..ba71f3f 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -212,6 +213,9 @@ extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
HeapTuple newtuple, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index dc9becc..742bbbe 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -220,7 +220,8 @@ $node_publisher->wait_for_catchup($appname);
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+# but the row is deleted by the UPDATE below.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -232,7 +233,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
--
1.8.3.1
On Sat, Oct 2, 2021 at 5:44 PM Ajin Cherian <itsajin@gmail.com> wrote:
I have for now also rebased the patch and merged the first 5 patches
into 1, and added my changes for the above into the second patch.
I have split the patches back again, just to be consistent with the
original state of the patches. Sorry for the inconvenience.
regards,
Ajin Cherian
Fujitsu Australia
Attachments:
v31-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v31-0001-Row-filter-for-logical-replication.patchDownload
From 8592f65490988b888934b7728f5d63bd4d100b42 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 6 Oct 2021 00:24:04 -0400
Subject: [PATCH v31] Row filter for logical replication.
This feature adds a row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; this could be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If the subscriber is a pre-15 version, data
synchronization won't use row filters even if they are defined in the
publisher, because previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 50 ++++-
src/backend/commands/publicationcmds.c | 104 ++++++----
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 5 +-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 257 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 4 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 5 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 71 +++++++
src/test/regress/sql/publication.sql | 32 +++
src/test/subscription/t/025_row_filter.pl | 300 ++++++++++++++++++++++++++++
25 files changed, 997 insertions(+), 78 deletions(-)
create mode 100644 src/test/subscription/t/025_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index fd6910d..43bc11f 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6233,6 +6233,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b..4bb4314 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbc..8f78fbb 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -183,6 +187,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will not
+ be replicated. That's because the old row is used, and it only contains the
+ primary key or <literal>REPLICA IDENTITY</literal> columns; the
+ remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ and <command>UPDATE</command> operations, any column might be used in the
+ <literal>WHERE</literal> clause, because the new row contains all
+ columns. A <literal>NULL</literal> value causes the expression to evaluate
+ to false; avoid using columns without not-null constraints in the
+ <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
+ not allow functions or user-defined operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -197,6 +216,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -210,6 +234,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..5a9430e 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -206,7 +206,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether to copy pre-existing data in the publications
that are being subscribed to when the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15, then any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9cd0c82..1d0f77d 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -34,6 +34,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -178,22 +181,28 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -210,10 +219,30 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -227,6 +256,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -243,6 +278,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 9c7f916..747f388 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -391,38 +391,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- PublicationRelInfo *newpubrel;
-
- newpubrel = (PublicationRelInfo *) lfirst(newlc);
- if (RelationGetRelid(newpubrel->relation) == oldrelid)
- {
- found = true;
- break;
- }
- }
- /* Not yet in the list, open it and add to the list */
- if (!found)
- {
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
- }
+ PublicationRelInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -552,9 +538,10 @@ RemovePublicationById(Oid pubid)
}
/*
- * Open relations specified by a PublicationTable list.
- * In the returned list of PublicationRelInfo, tables are locked
- * in ShareUpdateExclusiveLock mode in order to add them to a publication.
+ * Open relations specified by a RangeVar list (PublicationTable or Relation).
+ *
+ * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
+ * add them to a publication.
*/
static List *
OpenTableList(List *tables)
@@ -562,22 +549,46 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelInfo *pub_rel;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- PublicationTable *t = lfirst_node(PublicationTable, lc);
- bool recurse = t->relation->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
- PublicationRelInfo *pub_rel;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a PublicationTable List
+ * (Relation, Where clause). ALTER PUBLICATION ... DROP TABLE provides
+ * a Relation List. Check the List element to be used.
+ */
+ if (IsA(lfirst(lc), PublicationTable))
+ whereclause = true;
+ else
+ whereclause = false;
+
+ if (whereclause)
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+ else
+ {
+ rv = lfirst_node(RangeVar, lc);
+ }
+
+ recurse = rv->inh;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
- rel = table_openrv(t->relation, ShareUpdateExclusiveLock);
+ rel = table_openrv(rv, ShareUpdateExclusiveLock);
myrelid = RelationGetRelid(rel);
/*
@@ -594,7 +605,12 @@ OpenTableList(List *tables)
}
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = myrelid;
pub_rel->relation = rel;
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
@@ -629,7 +645,13 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = childrelid;
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -656,6 +678,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -676,7 +700,7 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(pub_rel->relid, GetUserId()))
aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
RelationGetRelationName(rel));
@@ -705,11 +729,9 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
PublicationRelInfo *pubrel = (PublicationRelInfo *) lfirst(lc);
- Relation rel = pubrel->relation;
- Oid relid = RelationGetRelid(rel);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pubrel->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -719,7 +741,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pubrel->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 228387e..a69e131 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4964,6 +4964,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 800f588..7a33695 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -3137,6 +3137,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 08f1bf1..ceeb795 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9637,10 +9637,11 @@ publication_table_list:
{ $$ = lappend($1, $3); }
;
-publication_table: relation_expr
+publication_table: relation_expr OptWhereClause
{
PublicationTable *n = makeNode(PublicationTable);
n->relation = $1;
+ n->whereClause = $2;
$$ = (Node *) n;
}
;
@@ -9681,7 +9682,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE publication_table_list
+ | ALTER PUBLICATION name DROP TABLE relation_expr_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE conditions
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..e946f17 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Disallow user-defined operators in publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..9d86a10 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get the relation qual. DISTINCT prevents the same expression for a table
+ * in multiple publications from being included more than once in the final
+ * expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 14d737f..1220203 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +148,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +157,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +639,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Prepare executor state for evaluating row filter expressions on the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache the ExprState using CacheMemoryContext. This mirrors
+ * ExecPrepareExpr(), which cannot be used here because it requires an
+ * EState. The executor should probably provide a separate function for
+ * expression execution outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluate the row filter expression.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change should be replicated, false otherwise.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has
+ * different row filters in those publications, all row filters must
+ * match for the change to be replicated.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +808,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +832,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +839,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * The schema must be sent before the relation-switching logic
+ * below, because it also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +872,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +906,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +975,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1113,9 +1294,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1138,6 +1320,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1149,6 +1333,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1162,6 +1347,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for the row filter. The TupleDesc must
+ * live as long as the cache entry remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1171,6 +1372,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1230,9 +1434,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1350,6 +1578,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1359,6 +1588,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1376,7 +1607,15 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ {
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
+ }
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index a485fb2..0f4892c 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4141,6 +4141,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4151,9 +4152,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4162,6 +4170,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4202,6 +4211,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4234,8 +4247,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 29af845..f932a70 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -629,6 +629,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index a33d77c..83249e8 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 82f2536..e5df91e 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,7 +85,9 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
+ Oid relid;
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -116,7 +118,7 @@ extern List *GetPubPartitionOptionRelations(List *result,
Oid relid);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3138877..1dfdaf3 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3640,6 +3640,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
typedef struct CreatePublicationStmt
@@ -3647,7 +3648,7 @@ typedef struct CreatePublicationStmt
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3660,7 +3661,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 82bce9b..a75ab3c 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -167,6 +167,77 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index e5745d5..5b22d51 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -100,6 +100,38 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
new file mode 100644
index 0000000..6428f0d
--- /dev/null
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -0,0 +1,300 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, every filter
+# expression must succeed. In this case, rows are replicated only if the c
+# value is divisible by both 2 AND 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
v31-0005-PS-POC-Row-filter-validation-walker.patchapplication/octet-stream; name=v31-0005-PS-POC-Row-filter-validation-walker.patchDownload
From 8a73d9854b69980f3e64b4b14b3bfc6792424672 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 6 Oct 2021 04:31:05 -0400
Subject: [PATCH v31] PS - POC Row filter validation walker
This patch implements a parse-tree "walker" to validate a row-filter expression.
Only very simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no functions.
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr
This POC patch also disables (#if 0) all other row-filter errors which were thrown for
EXPR_KIND_PUBLICATION_WHERE in the 0001 patch.
Some regression tests are updated due to the modified validation error messages.
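To make the allow-list strategy of this walker concrete, here is a minimal Python sketch of the same idea. This is an illustration only, not the C implementation: the dict-based node representation, the `kind`/`opno`/`args` keys, and the `FIRST_NORMAL_OBJECT_ID` constant are simplifications standing in for PostgreSQL's parse-tree nodes and OID boundary.

```python
# Illustrative sketch of the row-filter validation walker: recurse over an
# expression tree and reject any node kind outside a small allowed set.
# Nodes here are plain dicts; real code walks PostgreSQL Node structs.

FIRST_NORMAL_OBJECT_ID = 16384  # OIDs at or above this are user-defined


def validate_row_filter(node):
    """Return None if the filter is acceptable, else a reason string."""
    if node is None:
        return None
    kind = node.get("kind")
    if kind in ("Var", "Const", "BoolExpr"):
        pass  # allowed node kinds
    elif kind == "OpExpr":
        # Operators are fine, except user-defined ones.
        if node.get("opno", 0) >= FIRST_NORMAL_OBJECT_ID:
            return "user-defined operators are not allowed"
    elif kind == "FuncExpr":
        return "function calls are not allowed"
    else:
        # Anything else (subqueries, aggregates, ...) is rejected outright.
        return "too complex"
    for child in node.get("args", []):
        reason = validate_row_filter(child)
        if reason:
            return reason
    return None
```

For example, a filter shaped like `(a > 1000 AND b <> 'x')` passes, while a function call or a subquery is rejected with the corresponding reason, mirroring the errdetail strings in the patch.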
---
src/backend/catalog/dependency.c | 68 +++++++++++++++++++++++++++++++
src/backend/catalog/pg_publication.c | 14 +++++--
src/backend/parser/parse_agg.c | 5 ++-
src/backend/parser/parse_expr.c | 6 ++-
src/backend/parser/parse_func.c | 3 ++
src/backend/parser/parse_oper.c | 2 +
src/include/catalog/dependency.h | 2 +-
src/test/regress/expected/publication.out | 17 +++++---
src/test/regress/sql/publication.sql | 2 +
9 files changed, 107 insertions(+), 12 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index e81f093..405b3cd 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -132,6 +132,12 @@ typedef struct
int subflags; /* flags to pass down when recursing to obj */
} ObjectAddressAndFlags;
+/* for rowfilter_walker */
+typedef struct
+{
+ char *relname;
+} rf_context;
+
/* for find_expr_references_walker */
typedef struct
{
@@ -1554,6 +1560,68 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Walker checks that the row filter expression is legal. Allow only simple
+ * or compound expressions like:
+ *
+ * "(Var Op Const)" or
+ * "(Var Op Const) Bool (Var Op Const)"
+ *
+ * Nothing more complicated is permitted. Specifically, no functions of any kind
+ * and no user-defined operators.
+ */
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var) || IsA(node, Const) || IsA(node, BoolExpr))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ forbidden = _("function calls are not allowed");
+ }
+ else
+ {
+ elog(DEBUG1, "row filter contained something unexpected: %s", nodeToString(node));
+ forbidden = _("too complex");
+ }
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ context->relname),
+ errdetail("%s", forbidden),
+ errhint("only simple expressions using columns and constants are allowed")
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
+/*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if it encounters anything not permitted or unexpected.
+ */
+void
+rowfilter_validator(char *relname, Node *expr)
+{
+ rf_context context = {0};
+
+ context.relname = relname;
+ rowfilter_walker(expr, &context);
+}
+
+/*
* Find all the columns referenced by the row-filter expression and return what
* is found as a list of RfCol. This list is used for row-filter validation.
*/
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index eac7449..e49b3ca 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -144,7 +144,7 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Walk the parse-tree to decide if the row-filter is valid or not.
*/
static void
-rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node *rfnode, Relation rel)
{
Oid relid = RelationGetRelid(rel);
char *relname = RelationGetRelationName(rel);
@@ -152,6 +152,14 @@ rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
/*
* Rule:
*
+ * Walk the parse-tree and reject anything more complicated than a very
+ * simple expression.
+ */
+ rowfilter_validator(relname, rfnode);
+
+ /*
+ * Rule:
+ *
* If the publish operation contains "delete" then only columns that
* are allowed by the REPLICA IDENTITY rules are permitted to be used in
* the row-filter WHERE clause.
@@ -305,13 +313,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
whereclause = transformWhereClause(pstate,
copyObject(pri->whereClause),
EXPR_KIND_PUBLICATION_WHERE,
- "PUBLICATION");
+ "PUBLICATION WHERE");
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
/* Validate the row-filter. */
- rowfilter_expr_checker(pub, whereclause, targetrel);
+ rowfilter_expr_checker(pub, pstate, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..5e0c391 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,12 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
if (isAgg)
err = _("aggregate functions are not allowed in publication WHERE expressions");
else
err = _("grouping operations are not allowed in publication WHERE expressions");
-
+#endif
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +952,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("window functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..3519e62 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -201,6 +201,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
case T_FuncCall:
{
+#if 0
/*
* Forbid functions in publication WHERE condition
*/
@@ -209,6 +210,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("functions are not allowed in publication WHERE expressions"),
parser_errposition(pstate, exprLocation(expr))));
+#endif
result = transformFuncCall(pstate, (FuncCall *) expr);
break;
@@ -1777,7 +1779,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("cannot use subquery in publication WHERE expression");
+#endif
break;
/*
@@ -3100,7 +3104,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index e946f17..de9600f 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,10 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+ pstate->p_hasTargetSRFs = true;
+#if 0
err = _("set-returning functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..b3588df 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,12 +718,14 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+#if 0
/* Check it's not a custom operator for publication WHERE expressions */
if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("user-defined operators are not allowed in publication WHERE expressions"),
parser_errposition(pstate, location)));
+#endif
/* Do typecasting and build the expression tree */
if (ltree == NULL)
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 2c7310e..dd69aff 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -16,7 +16,6 @@
#include "catalog/objectaddress.h"
-
/*
* Precise semantics of a dependency relationship are specified by the
* DependencyType code (which is stored in a "char" field in pg_depend,
@@ -156,6 +155,7 @@ typedef struct RfCol {
int attnum;
} RfCol;
extern List *rowfilter_find_cols(Node *expr, Oid relId);
+extern void rowfilter_validator(char *relname, Node *expr);
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index e7c8c19..e10adc8 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -218,16 +218,21 @@ Tables:
-- fail - functions disallowed
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl4"
+DETAIL: function calls are not allowed
+HINT: only simple expressions using columns and constants are allowed
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+HINT: only simple expressions using columns and constants are allowed
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: too complex
+HINT: only simple expressions using columns and constants are allowed
-- fail - WHERE not allowed in DROP
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: syntax error at or near "WHERE"
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 6701d50..a30657b 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -123,6 +123,8 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
-- fail - WHERE not allowed in DROP
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
--
1.8.3.1
v31-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patchapplication/octet-stream; name=v31-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patchDownload
From 2961710963d3cc69e3a6b4723528b1c96bc481ae Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 6 Oct 2021 00:29:44 -0400
Subject: [PATCH v31] PS - Add tab auto-complete support for the Row Filter
WHERE.
The following auto-completions are added:
Complete "CREATE PUBLICATION <name> FOR TABLE <name>" with "WHERE (".
Complete "ALTER PUBLICATION <name> ADD TABLE <name>" with "WHERE (".
Complete "ALTER PUBLICATION <name> SET TABLE <name>" with "WHERE (".
---
src/bin/psql/tab-complete.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index ecae9df..bd35d19 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1648,6 +1648,11 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET", "TABLE", MatchAny)
+ || Matches("ALTER", "PUBLICATION", MatchAny, "ADD", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (");
/* ALTER PUBLICATION <name> SET ( */
else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("publish", "publish_via_partition_root");
@@ -2693,9 +2698,10 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("TABLE", "ALL TABLES");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL"))
COMPLETE_WITH("TABLES");
- else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES")
- || Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("WITH (");
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
--
1.8.3.1
v31-0003-PS-ExprState-cache-modifications.patchapplication/octet-stream; name=v31-0003-PS-ExprState-cache-modifications.patchDownload
From c7f02408816b253dbbadd01c40d571d3459c3cbb Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 6 Oct 2021 00:33:05 -0400
Subject: [PATCH v31] PS - ExprState cache modifications.
Now the cached row-filter state (e.g. the ExprState list) is invalidated only in
the rel_sync_cache_relation_cb function, so an ALTER PUBLICATION for one
table should not cause the row filters of other tables to also become invalidated.
Also, all code related to caching row filters has been removed from the
get_rel_sync_entry function; the caching is now done just before the filters are
needed, in the pgoutput_row_filter function.
Changes are based on suggestions from Amit [1] [2].
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
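The lazy, per-relation caching scheme described above can be sketched as follows. This is a hypothetical Python simplification, not the pgoutput code: `RelEntry`, `load_filters`, and the dict cache stand in for RelationSyncEntry, the catalog lookup, and the relation sync hash table.

```python
# Sketch of per-relation lazy caching with targeted invalidation: each
# entry carries a validity flag; filters are (re)built only on first use
# after the flag is cleared, and invalidation touches one relation only
# (analogous to rel_sync_cache_relation_cb).

class RelEntry:
    def __init__(self):
        self.rowfilter_valid = False
        self.filters = []

cache = {}
build_count = {"n": 0}  # counts catalog rebuilds, for demonstration


def load_filters(relid):
    # Stand-in for reading the publication catalogs for this relation.
    build_count["n"] += 1
    return ["filter-for-%d" % relid]


def row_filters(relid):
    # Analogous to the check at the top of pgoutput_row_filter:
    # usually one-time execution per relation.
    entry = cache.setdefault(relid, RelEntry())
    if not entry.rowfilter_valid:
        entry.filters = load_filters(relid)
        entry.rowfilter_valid = True
    return entry.filters


def invalidate(relid):
    # Only the named relation's cached filters become invalid.
    if relid in cache:
        cache[relid].rowfilter_valid = False
```

The point of the design is visible in the flag: invalidating relation 1 leaves relation 2's cached filters intact, so an ALTER PUBLICATION affecting one table does not force rebuilds for every table.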
---
src/backend/replication/pgoutput/pgoutput.c | 200 +++++++++++++++++++---------
1 file changed, 136 insertions(+), 64 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 1220203..ce5e1c5 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -123,7 +123,15 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
- List *exprstate; /* ExprState for row filter */
+
+ /*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' only means that exprstate_list is correct;
+ * it doesn't mean that there actually is any row filter present for the
+ * current relid.
+ */
+ bool rowfilter_valid;
+ List *exprstate_list; /* ExprState for row filter(s) */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
@@ -161,7 +169,7 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static EState *create_estate_for_relation(Relation rel);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
/*
@@ -731,20 +739,121 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
* If it returns true, the change is replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
{
EState *estate;
ExprContext *ecxt;
ListCell *lc;
bool result = true;
+ Oid relid = RelationGetRelid(relation);
+
+ /*
+ * If the row filter cache is currently flagged "invalid" then we don't yet
+ * know whether there are any row filters for this relation.
+ *
+ * This code usually executes only once per relation.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ bool am_partition = get_rel_relispartition(relid);
+ MemoryContext oldctx;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create a tuple table slot for the row filter. The TupleDesc must live
+ * as long as the cache remains, so release the tuple table slot if it
+ * already exists.
+ */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
+ /*
+ * Find out whether there are any row filters for this relation. If there
+ * are, prepare the necessary ExprState(s) and cache them in
+ * entry->exprstate_list.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+ Oid pub_relid = relid;
+
+ if (pub->pubviaroot && am_partition)
+ {
+ if (pub->alltables)
+ pub_relid = llast_oid(get_partition_ancestors(relid));
+ else
+ {
+ List *ancestors = get_partition_ancestors(relid);
+ ListCell *lc2;
+
+ /*
+ * Find the "topmost" ancestor that is in this
+ * publication.
+ */
+ foreach(lc2, ancestors)
+ {
+ Oid ancestor = lfirst_oid(lc2);
+
+ if (list_member_oid(GetRelationPublications(ancestor),
+ pub->oid))
+ {
+ pub_relid = ancestor;
+ }
+ }
+ }
+ }
+
+ /*
+ * Look up whether there is a row filter, and if so build the ExprState for it.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(pub_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate_list = lappend(entry->exprstate_list, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+
+ } /* loop all subscribed publications */
+
+ entry->rowfilter_valid = true;
+ }
/* Bail out if there is no row filter */
- if (entry->exprstate == NIL)
+ if (entry->exprstate_list == NIL)
return true;
elog(DEBUG3, "table \"%s.%s\" has row filter",
- get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
- get_rel_name(relation->rd_id));
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
PushActiveSnapshot(GetTransactionSnapshot());
@@ -761,7 +870,7 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
* different row filter in these publications, all row filters must be
* matched in order to replicate this change.
*/
- foreach(lc, entry->exprstate)
+ foreach(lc, entry->exprstate_list)
{
ExprState *exprstate = (ExprState *) lfirst(lc);
@@ -840,7 +949,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
break;
/*
@@ -873,7 +982,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -907,7 +1016,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1318,10 +1427,11 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
- entry->exprstate = NIL;
+ entry->exprstate_list = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1333,7 +1443,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
- TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1347,22 +1456,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
publications_valid = true;
}
- /* Release tuple table slot */
- if (entry->scantuple != NULL)
- {
- ExecDropSingleTupleTableSlot(entry->scantuple);
- entry->scantuple = NULL;
- }
-
- /*
- * Create a tuple table slot for row filter. TupleDesc must live as
- * long as the cache remains.
- */
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- tupdesc = CreateTupleDescCopy(tupdesc);
- entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
- MemoryContextSwitchTo(oldctx);
-
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1372,9 +1465,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
{
Publication *pub = lfirst(lc);
bool publish = false;
- HeapTuple rftuple;
- Datum rfdatum;
- bool rfisnull;
if (pub->alltables)
{
@@ -1434,33 +1524,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- /*
- * Cache row filter, if available. All publication-table mappings
- * must be checked. If it is a partition and pubviaroot is true,
- * use the row filter of the topmost partitioned table instead of
- * the row filter of its own partition.
- */
- rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
- if (HeapTupleIsValid(rftuple))
- {
- rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
-
- if (!rfisnull)
- {
- Node *rfnode;
- ExprState *exprstate;
-
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = stringToNode(TextDatumGetCString(rfdatum));
-
- /* Prepare for expression execution */
- exprstate = pgoutput_row_filter_init_expr(rfnode);
- entry->exprstate = lappend(entry->exprstate, exprstate);
- MemoryContextSwitchTo(oldctx);
- }
-
- ReleaseSysCache(rftuple);
- }
}
list_free(pubids);
@@ -1567,6 +1630,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanup. (It will be rebuilt later if needed.)
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstate_list != NIL)
+ {
+ list_free_deep(entry->exprstate_list);
+ entry->exprstate_list = NIL;
+ }
}
}
@@ -1607,12 +1685,6 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
-
- if (entry->exprstate != NIL)
- {
- list_free_deep(entry->exprstate);
- entry->exprstate = NIL;
- }
}
MemoryContextSwitchTo(oldctx);
--
1.8.3.1
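To make the intended effect of the 0003 patch concrete, here is a sketch of a session (illustrative names; it assumes the patch is applied). The point is where and when the cached ExprState is built and invalidated:

```sql
-- Both tables get a row filter; pgoutput builds each table's ExprState
-- lazily, the first time a change for that table reaches
-- pgoutput_row_filter (no longer eagerly in get_rel_sync_entry).
CREATE PUBLICATION pub FOR TABLE t1 WHERE (a > 0), t2 WHERE (b > 0);

-- rel_sync_cache_relation_cb invalidates only the entries of the tables
-- actually affected, so t1's and t2's cached filters survive this:
ALTER PUBLICATION pub ADD TABLE t3 WHERE (c > 0);
```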
v31-0004-PS-Row-filter-validation-of-replica-identity.patchapplication/octet-stream; name=v31-0004-PS-Row-filter-validation-of-replica-identity.patchDownload
From 82e14c19dff9f17e2a511a84514f4a8d5a450a92 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 6 Oct 2021 04:11:40 -0400
Subject: [PATCH v31] PS - Row filter validation of replica identity.
This patch introduces some additional row filter validation. Currently it is
implemented only for the publish mode "delete": it validates that every column
referenced in the filter expression is part of the REPLICA IDENTITY or the
primary key.
Also updated test code.
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
---
src/backend/catalog/dependency.c | 58 ++++++++++++++++++
src/backend/catalog/pg_publication.c | 74 ++++++++++++++++++++++-
src/include/catalog/dependency.h | 6 ++
src/test/regress/expected/publication.out | 97 +++++++++++++++++++++++++++++--
src/test/regress/sql/publication.sql | 79 ++++++++++++++++++++++++-
src/test/subscription/t/025_row_filter.pl | 7 +--
6 files changed, 309 insertions(+), 12 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index 91c3e97..e81f093 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -1554,6 +1554,64 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Find all the columns referenced by the row-filter expression and return what
+ * is found as a list of RfCol. This list is used for row-filter validation.
+ */
+List *
+rowfilter_find_cols(Node *expr, Oid relId)
+{
+ find_expr_references_context context;
+ RangeTblEntry rte;
+ int ref;
+ List *rfcol_list = NIL;
+
+ context.addrs = new_object_addresses();
+
+ /* We gin up a rather bogus rangetable list to handle Vars */
+ MemSet(&rte, 0, sizeof(rte));
+ rte.type = T_RangeTblEntry;
+ rte.rtekind = RTE_RELATION;
+ rte.relid = relId;
+ rte.relkind = RELKIND_RELATION; /* no need for exactness here */
+ rte.rellockmode = AccessShareLock;
+
+ context.rtables = list_make1(list_make1(&rte));
+
+ /* Scan the expression tree for referenceable objects */
+ find_expr_references_walker(expr, &context);
+
+ /* Remove any duplicates */
+ eliminate_duplicate_dependencies(context.addrs);
+
+ /* Build/Return the list of columns referenced by this Row Filter */
+ for (ref = 0; ref < context.addrs->numrefs; ref++)
+ {
+ ObjectAddress *thisobj = context.addrs->refs + ref;
+
+ if (thisobj->classId == RelationRelationId)
+ {
+ RfCol *rfcol;
+
+ /*
+ * The parser already took care of ensuring that columns come from
+ * the correct table.
+ */
+ Assert(thisobj->objectId == relId);
+
+ rfcol = palloc(sizeof(RfCol));
+ rfcol->name = get_attname(thisobj->objectId, thisobj->objectSubId, false);
+ rfcol->attnum = thisobj->objectSubId;
+
+ rfcol_list = lappend(rfcol_list, rfcol);
+ }
+ }
+
+ free_object_addresses(context.addrs);
+
+ return rfcol_list;
+}
+
+/*
* recordDependencyOnExpr - find expression dependencies
*
* This is used to find the dependencies of rules, constraint expressions,
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 1d0f77d..eac7449 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -141,9 +141,76 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
}
/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
+ * Walk the parse-tree to decide if the row-filter is valid or not.
*/
+static void
+rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+{
+ Oid relid = RelationGetRelid(rel);
+ char *relname = RelationGetRelationName(rel);
+
+ /*
+ * Rule:
+ *
+ * If the publish operation contains "delete" then only columns that
+ * are allowed by the REPLICA IDENTITY rules are permitted to be used in
+ * the row-filter WHERE clause.
+ *
+ * TODO - check later for publish "update" case.
+ */
+ if (pub->pubactions.pubdelete)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ List *rfcols;
+ ListCell *lc;
+ Bitmapset *bms_okcols;
+
+ /*
+ * Find what are the cols that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+
+ /*
+ * Find what cols are referenced in the row filter WHERE clause,
+ * and validate that each of those referenced cols is allowed.
+ */
+ rfcols = rowfilter_find_cols(rfnode, relid);
+ foreach(lc, rfcols)
+ {
+ RfCol *rfcol = lfirst(lc);
+ char *colname = rfcol->name;
+ int attnum = rfcol->attnum;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, bms_okcols))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ relname),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY", colname)));
+ }
+ }
+
+ bms_free(bms_okcols);
+ list_free_deep(rfcols);
+ }
+ }
+}
+
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -242,6 +309,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 2885f35..2c7310e 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -151,6 +151,12 @@ extern void performDeletion(const ObjectAddress *object,
extern void performMultipleDeletions(const ObjectAddresses *objects,
DropBehavior behavior, int flags);
+typedef struct RfCol {
+ char *name;
+ int attnum;
+} RfCol;
+extern List *rowfilter_find_cols(Node *expr, Oid relId);
+
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
DependencyType behavior);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index a75ab3c..e7c8c19 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -172,13 +172,15 @@ CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -188,7 +190,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -199,7 +201,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -210,7 +212,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
@@ -238,6 +240,91 @@ DROP TABLE testpub_rf_tbl4;
DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 5b22d51..6701d50 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -105,7 +105,9 @@ CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -132,6 +134,81 @@ DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index 6428f0d..dc9becc 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -223,9 +225,7 @@ $node_publisher->wait_for_catchup($appname);
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -234,7 +234,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
--
1.8.3.1
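The validation rule added by the 0004 patch can be summarized with a sketch session (illustrative names; assumes the patch is applied). With the default publish actions, which include "delete", a row filter may only reference replica-identity columns; with publish = "insert" no old tuple is ever needed, so the check is skipped:

```sql
CREATE TABLE rf_demo (a int, b int, c int, PRIMARY KEY (a));

-- ok: "a" is part of the primary key (the default replica identity)
CREATE PUBLICATION p_ok FOR TABLE rf_demo WHERE (a > 99);

-- fails: "c" is not part of the replica identity
-- ERROR:  cannot add relation "rf_demo" to publication
CREATE PUBLICATION p_bad FOR TABLE rf_demo WHERE (c > 99);

-- ok: publish = "insert" means pubdelete is false, so the check is skipped
CREATE PUBLICATION p_ins FOR TABLE rf_demo WHERE (c > 99)
    WITH (publish = "insert");
```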
v31-0006-Support-updates-based-on-old-and-new-tuple-in-ro.patchapplication/octet-stream; name=v31-0006-Support-updates-based-on-old-and-new-tuple-in-ro.patchDownload
From ac0e5b0d91d9c119d0970185d38a227323075882 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 6 Oct 2021 04:53:46 -0400
Subject: [PATCH v31] Support updates based on old and new tuple in row
filters.
When applying a row filter to updates, check both the old tuple and the new
tuple to decide how the update needs to be transformed:
UPDATE
old-row (match) new-row (no match) -> DELETE
old-row (no match) new row (match) -> INSERT
old-row (match) new row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Tuples that have been deformed are also cached in slots to avoid deforming
the same tuple multiple times.
---
src/backend/replication/logical/proto.c | 122 ++++++++++++++++++++
src/backend/replication/pgoutput/pgoutput.c | 168 +++++++++++++++++++++++++---
src/include/replication/logicalproto.h | 4 +
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/025_row_filter.pl | 4 +-
5 files changed, 285 insertions(+), 19 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..6b14340 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -32,6 +33,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
HeapTuple tuple, bool binary);
+static void logicalrep_write_tuple_cached(StringInfo out, Relation rel,
+ TupleTableSlot *slot, bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -438,6 +441,38 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
}
/*
+ * Write UPDATE to the output stream using cached virtual slots.
+ * Cached updates will have both old tuple and new tuple.
+ */
+void
+logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple, bool binary)
+{
+ pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
+
+ Assert(rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT ||
+ rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
+ rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX);
+
+ /* transaction ID (if not valid, we're not streaming) */
+ if (TransactionIdIsValid(xid))
+ pq_sendint32(out, xid);
+
+ /* use Oid as relation identifier */
+ pq_sendint32(out, RelationGetRelid(rel));
+
+ if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ pq_sendbyte(out, 'O'); /* old tuple follows */
+ else
+ pq_sendbyte(out, 'K'); /* old key follows */
+ logicalrep_write_tuple_cached(out, rel, oldtuple, binary);
+
+ pq_sendbyte(out, 'N'); /* new tuple follows */
+ logicalrep_write_tuple_cached(out, rel, newtuple, binary);
+}
+
+
+/*
* Write UPDATE to the output stream.
*/
void
@@ -746,6 +781,93 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
}
/*
+ * Write a tuple to the output stream using a cached slot, in the most efficient format possible.
+ */
+static void
+logicalrep_write_tuple_cached(StringInfo out, Relation rel, TupleTableSlot *slot, bool binary)
+{
+ TupleDesc desc;
+ int i;
+ uint16 nliveatts = 0;
+ HeapTuple tuple = ExecFetchSlotHeapTuple(slot, false, NULL);
+
+ desc = RelationGetDescr(rel);
+
+ for (i = 0; i < desc->natts; i++)
+ {
+ if (TupleDescAttr(desc, i)->attisdropped || TupleDescAttr(desc, i)->attgenerated)
+ continue;
+ nliveatts++;
+ }
+ pq_sendint16(out, nliveatts);
+
+ /* try to allocate enough memory from the get-go */
+ enlargeStringInfo(out, tuple->t_len +
+ nliveatts * (1 + 4));
+
+ /* Write the values */
+ for (i = 0; i < desc->natts; i++)
+ {
+ HeapTuple typtup;
+ Form_pg_type typclass;
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ if (att->attisdropped || att->attgenerated)
+ continue;
+
+ if (slot->tts_isnull[i])
+ {
+ pq_sendbyte(out, LOGICALREP_COLUMN_NULL);
+ continue;
+ }
+
+ if (att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(slot->tts_values[i]))
+ {
+ /*
+ * Unchanged toasted datum. (Note that we don't promise to detect
+ * unchanged data in general; this is just a cheap check to avoid
+ * sending large values unnecessarily.)
+ */
+ pq_sendbyte(out, LOGICALREP_COLUMN_UNCHANGED);
+ continue;
+ }
+
+ typtup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(att->atttypid));
+ if (!HeapTupleIsValid(typtup))
+ elog(ERROR, "cache lookup failed for type %u", att->atttypid);
+ typclass = (Form_pg_type) GETSTRUCT(typtup);
+
+ /*
+ * Send in binary if requested and type has suitable send function.
+ */
+ if (binary && OidIsValid(typclass->typsend))
+ {
+ bytea *outputbytes;
+ int len;
+
+ pq_sendbyte(out, LOGICALREP_COLUMN_BINARY);
+ outputbytes = OidSendFunctionCall(typclass->typsend, slot->tts_values[i]);
+ len = VARSIZE(outputbytes) - VARHDRSZ;
+ pq_sendint(out, len, 4); /* length */
+ pq_sendbytes(out, VARDATA(outputbytes), len); /* data */
+ pfree(outputbytes);
+ }
+ else
+ {
+ char *outputstr;
+
+ pq_sendbyte(out, LOGICALREP_COLUMN_TEXT);
+ outputstr = OidOutputFunctionCall(typclass->typoutput, slot->tts_values[i]);
+ pq_sendcountedtext(out, outputstr, strlen(outputstr), false);
+ pfree(outputstr);
+ }
+
+ ReleaseSysCache(typtup);
+ }
+}
+
+
+/*
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index ce5e1c5..bdd61f9 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -133,6 +133,8 @@ typedef struct RelationSyncEntry
bool rowfilter_valid;
List *exprstate_list; /* ExprState for row filter(s) */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* tuple table slot for storing deformed new tuple */
+ TupleTableSlot *old_tuple; /* tuple table slot for storing deformed old tuple */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -167,10 +169,14 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry,
+ ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -734,18 +740,107 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * Update is checked against the row filter, if any.
+ * Updates are transformed to inserts and deletes based on the
+ * old_tuple and new_tuple. The new action is updated in the
+ * action parameter. If not updated, action remains as update.
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (no match) new row (match) -> INSERT
+ * old-row (match) new row (match) -> UPDATE
+ * old-row (no match) new-row (no match) -> (drop change)
+ * If it returns true, the change is to be replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter_update(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
{
- EState *estate;
- ExprContext *ecxt;
+ TupleDesc desc = entry->scantuple->tts_tupleDescriptor;
+ int i;
+ bool old_matched, new_matched, newtup_changed = false;
+ HeapTuple tmpnewtuple;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate_list == NIL)
+ return true;
+
+ /* updates require a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /*
+ * If no old_tuple, then none of the replica identity columns changed
+ * and this would reduce to a simple update.
+ */
+ if (!oldtuple)
+ return pgoutput_row_filter(relation, NULL, newtuple, entry);
+
+
+ old_slot = MakeSingleTupleTableSlot(desc, &TTSOpsVirtual);
+ new_slot = MakeSingleTupleTableSlot(desc, &TTSOpsVirtual);
+ tmp_new_slot = MakeSingleTupleTableSlot(desc, &TTSOpsVirtual);
+
+ /*
+ * Unchanged toasted replica identity columns are
+ * only detoasted in the old tuple, copy this over to the newtuple.
+ */
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+ entry->old_tuple = old_slot;
+ entry->new_tuple = new_slot;
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, entry->new_tuple);
+
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ newtup_changed = true;
+ }
+
+ }
+
+ if (newtup_changed)
+ tmpnewtuple = heap_form_tuple(desc, tmp_new_slot->tts_values, new_slot->tts_isnull);
+
+ old_matched = pgoutput_row_filter(relation, NULL, oldtuple, entry);
+ new_matched = pgoutput_row_filter(relation, NULL,
+ newtup_changed ? tmpnewtuple : newtuple, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && !old_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter the first time it is needed.
+ */
+
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
+{
+ Oid relid = RelationGetRelid(relation);
ListCell *lc;
- bool result = true;
- Oid relid = RelationGetRelid(relation);
/*
* If the row filter caching is currently flagged "invalid" then it means we
@@ -769,9 +864,13 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
+
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = NULL;
+ entry->new_tuple = NULL;
+
MemoryContextSwitchTo(oldctx);
/*
@@ -846,6 +945,21 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
entry->rowfilter_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ ListCell *lc;
/* Bail out if there is no row filter */
if (entry->exprstate_list == NIL)
@@ -941,6 +1055,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -949,7 +1066,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
break;
/*
@@ -980,9 +1097,11 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter_update(relation, oldtuple, newtuple, relentry,
+ &modified_action))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1005,8 +1124,29 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ if (relentry->old_tuple && relentry->new_tuple)
+ logicalrep_write_update_cached(ctx->out, xid, relation,
+ relentry->old_tuple, relentry->new_tuple, data->binary);
+ else
+ logicalrep_write_update(ctx->out, xid, relation, oldtuple,
+ newtuple, data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1016,7 +1156,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..ba71f3f 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -212,6 +213,9 @@ extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
HeapTuple newtuple, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index dc9becc..742bbbe 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -220,7 +220,8 @@ $node_publisher->wait_for_catchup($appname);
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+# but row deleted after the update below.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -232,7 +233,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
--
1.8.3.1
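The old/new match table in the patch's pgoutput_row_filter_update comment reduces to a small decision function. Below is a self-contained sketch of that mapping; the enum and function names here are stand-ins for illustration (the real code sets a ReorderBufferChangeType through an out parameter):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the relevant ReorderBufferChangeType values. */
typedef enum { CHANGE_INSERT, CHANGE_UPDATE, CHANGE_DELETE } ChangeType;

/*
 * Decide how a filtered UPDATE is replicated, per the table in the
 * patch comment:
 *   old matches,  new doesn't -> DELETE
 *   old doesn't,  new matches -> INSERT
 *   both match                -> UPDATE
 *   neither matches           -> drop the change (return false)
 * Returns true if the change should be sent, storing the action.
 */
static bool
filter_update_action(bool old_matched, bool new_matched, ChangeType *action)
{
    if (!old_matched && !new_matched)
        return false;               /* drop the change entirely */

    if (old_matched && new_matched)
        *action = CHANGE_UPDATE;
    else if (old_matched)
        *action = CHANGE_DELETE;    /* row left the published set */
    else
        *action = CHANGE_INSERT;    /* row entered the published set */
    return true;
}
```

The caller then dispatches on the returned action, as pgoutput_change does in the patch's switch over modified_action.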
On Wed, Oct 6, 2021 at 2:33 PM Ajin Cherian <itsajin@gmail.com> wrote:
On Sat, Oct 2, 2021 at 5:44 PM Ajin Cherian <itsajin@gmail.com> wrote:
I have for now also rebased the patch and merged the first 5 patches
into 1, and added my changes for the above into the second patch.
I have split the patches back again, just to be consistent with the
original state of the patches. Sorry for the inconvenience.
Thanks for the updated version of the patch. I was looking into the
latest version and I have a few comments.
+ if ((att->attlen == -1 &&
VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ newtup_changed = true;
+ }
If the attribute is stored EXTERNAL_ONDISK in the new tuple and it is
not null in the old tuple, then it must be logged completely in the old
tuple, so instead of checking
!(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i])), it should be
asserted.
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values,
new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values,
old_slot->tts_isnull);
+
+ if (newtup_changed)
+ tmpnewtuple = heap_form_tuple(desc, tmp_new_slot->tts_values,
new_slot->tts_isnull);
+
+ old_matched = pgoutput_row_filter(relation, NULL, oldtuple, entry);
+ new_matched = pgoutput_row_filter(relation, NULL,
+ newtup_changed ? tmpnewtuple :
newtuple, entry);
I do not like the fact that we first deform the tuples, then use the
HeapTuple for the expression evaluation machinery, and later deform
again during expression evaluation.
So why don't you use the deformed tuple as it is, stored as a virtual tuple?
In fact, if newtup_changed is true, then you are forming the tuple back
just to get it deformed again in the expression evaluation.
I think I have already given this comment on the last version.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
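The toast point above can be illustrated with a stub: because replica-identity columns are WAL-logged in full, an old-tuple value should never itself be an on-disk toast pointer when the new tuple still carries one, so the condition can be an assertion rather than a runtime check. A minimal sketch with stand-in types (the struct and function names are invented for illustration; the real code works on Datum values and the VARATT_IS_EXTERNAL_ONDISK macro):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for one attribute value; flags mimic the heap-tuple cases. */
typedef struct
{
    bool isnull;
    bool toast_pointer;   /* VARATT_IS_EXTERNAL_ONDISK in the real code */
    int  datum;           /* stand-in for the actual Datum */
} Col;

/*
 * If the new tuple still carries an on-disk toast pointer for a column
 * and the old tuple has a value for it, copy the old (fully logged,
 * detoasted) value into the new tuple so the row filter can evaluate it.
 * The invariant: the old value is never itself a toast pointer.
 */
static bool
fill_unchanged_toast(Col *newc, const Col *oldc)
{
    if (newc->isnull || !newc->toast_pointer)
        return false;                   /* nothing to substitute */
    if (oldc->isnull)
        return false;                   /* no old value to copy */

    assert(!oldc->toast_pointer);       /* invariant: old value logged in full */
    newc->datum = oldc->datum;
    newc->toast_pointer = false;
    return true;                        /* caller's newtup_changed flag */
}
```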
On Tue, Oct 12, 2021 at 1:37 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
So why don't you use the deformed tuple as it is to store as a virtual tuple?
In fact, if newtup_changed is true then you are forming back the tuple
just to get it deformed again in the expression evaluation.
I think I have already given this comment on the last version.
Right, I only used the deformed tuple later when it was written to the
stream. I will modify this as well.
regards,
Ajin Cherian
Fujitsu Australia
On Tue, Oct 12, 2021 at 1:33 PM Ajin Cherian <itsajin@gmail.com> wrote:
So why don't you use the deformed tuple as it is to store as a virtual tuple?
Right, I only used the deformed tuple later when it was written to the
stream. I will modify this as well.
I have made the change to use the virtual slot for expression
evaluation and avoided tuple deformation.
regards,
Ajin Cherian
Fujitsu Australia
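The virtual-slot approach the patch now takes can be sketched in miniature: deform the row once into parallel values/isnull arrays and evaluate the filter on those arrays directly, instead of re-forming a heap tuple just so the expression machinery can deform it again. The types and names below are stand-ins; the filter mirrors the WHERE clause used in the 025_row_filter.pl test (a > 1000 AND b <> 'filtered'), including NULL evaluating to false:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * A deformed ("virtual") row: plain per-column values plus null flags,
 * the shape a TTSOpsVirtual TupleTableSlot stores.  Deform once, then
 * evaluate the filter on these fields directly -- no heap_form_tuple /
 * heap_deform_tuple round trip before expression evaluation.
 */
typedef struct
{
    int         a;          /* column a */
    const char *b;          /* column b */
    bool        isnull[2];  /* per-column null flags */
} VirtualRow;

/* Mirrors the test's row filter: a > 1000 AND b <> 'filtered'. */
static bool
row_filter(const VirtualRow *row)
{
    if (row->isnull[0] || row->isnull[1])
        return false;       /* a NULL filter result drops the row */
    return row->a > 1000 && strcmp(row->b, "filtered") != 0;
}
```

With newtup_changed handled by patching the values array in place, the same arrays can later feed the output path, which is what logicalrep_write_tuple_cached consumes.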
Attachments:
v32-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v32-0001-Row-filter-for-logical-replication.patchDownload
From 8592f65490988b888934b7728f5d63bd4d100b42 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 6 Oct 2021 00:24:04 -0400
Subject: [PATCH v32] Row filter for logical replication.
This feature adds a row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If the subscriber is a pre-15 version, data
synchronization won't use row filters even if they are defined on the
publisher; previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 50 ++++-
src/backend/commands/publicationcmds.c | 104 ++++++----
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 5 +-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 257 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 4 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 5 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 71 +++++++
src/test/regress/sql/publication.sql | 32 +++
src/test/subscription/t/025_row_filter.pl | 300 ++++++++++++++++++++++++++++
25 files changed, 997 insertions(+), 78 deletions(-)
create mode 100644 src/test/subscription/t/025_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index fd6910d..43bc11f 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6233,6 +6233,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b..4bb4314 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbc..8f78fbb 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -183,6 +187,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will not
+ be replicated. That's because the old row is used and it only contains the
+ primary key or columns that are part of the <literal>REPLICA IDENTITY</literal>;
+ the remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ and <command>UPDATE</command> operations, any column might be used in the
+ <literal>WHERE</literal> clause. The new row is used and it contains all
+ columns. A <literal>NULL</literal> value causes the expression to evaluate
+ to false; avoid using columns without not-null constraints in the
+ <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
+ not allow functions or user-defined operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -197,6 +216,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -210,6 +234,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..5a9430e 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -206,7 +206,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether to copy pre-existing data in the publications
that are being subscribed to when the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9cd0c82..1d0f77d 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -34,6 +34,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -178,22 +181,28 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -210,10 +219,30 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -227,6 +256,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -243,6 +278,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 9c7f916..747f388 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -391,38 +391,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- PublicationRelInfo *newpubrel;
-
- newpubrel = (PublicationRelInfo *) lfirst(newlc);
- if (RelationGetRelid(newpubrel->relation) == oldrelid)
- {
- found = true;
- break;
- }
- }
- /* Not yet in the list, open it and add to the list */
- if (!found)
- {
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
- }
+ PublicationRelInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -552,9 +538,10 @@ RemovePublicationById(Oid pubid)
}
/*
- * Open relations specified by a PublicationTable list.
- * In the returned list of PublicationRelInfo, tables are locked
- * in ShareUpdateExclusiveLock mode in order to add them to a publication.
+ * Open relations specified by a list of PublicationTable or RangeVar nodes.
+ *
+ * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
+ * add them to a publication.
*/
static List *
OpenTableList(List *tables)
@@ -562,22 +549,46 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelInfo *pub_rel;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- PublicationTable *t = lfirst_node(PublicationTable, lc);
- bool recurse = t->relation->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
- PublicationRelInfo *pub_rel;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a list of PublicationTable
+ * nodes (relation plus optional WHERE clause), while ALTER PUBLICATION
+ * ... DROP TABLE provides a plain list of relations. Check which kind
+ * of list element we are dealing with.
+ */
+ if (IsA(lfirst(lc), PublicationTable))
+ whereclause = true;
+ else
+ whereclause = false;
+
+ if (whereclause)
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+ else
+ {
+ rv = lfirst_node(RangeVar, lc);
+ }
+
+ recurse = rv->inh;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
- rel = table_openrv(t->relation, ShareUpdateExclusiveLock);
+ rel = table_openrv(rv, ShareUpdateExclusiveLock);
myrelid = RelationGetRelid(rel);
/*
@@ -594,7 +605,12 @@ OpenTableList(List *tables)
}
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = myrelid;
pub_rel->relation = rel;
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
@@ -629,7 +645,13 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = childrelid;
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -656,6 +678,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -676,7 +700,7 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(pub_rel->relid, GetUserId()))
aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
RelationGetRelationName(rel));
@@ -705,11 +729,9 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
PublicationRelInfo *pubrel = (PublicationRelInfo *) lfirst(lc);
- Relation rel = pubrel->relation;
- Oid relid = RelationGetRelid(rel);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pubrel->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -719,7 +741,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pubrel->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 228387e..a69e131 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4964,6 +4964,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 800f588..7a33695 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -3137,6 +3137,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 08f1bf1..ceeb795 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9637,10 +9637,11 @@ publication_table_list:
{ $$ = lappend($1, $3); }
;
-publication_table: relation_expr
+publication_table: relation_expr OptWhereClause
{
PublicationTable *n = makeNode(PublicationTable);
n->relation = $1;
+ n->whereClause = $2;
$$ = (Node *) n;
}
;
@@ -9681,7 +9682,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE publication_table_list
+ | ALTER PUBLICATION name DROP TABLE relation_expr_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
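The grammar change above means `publication_table` now accepts an optional WHERE clause, while `ALTER PUBLICATION ... DROP TABLE` goes back to a plain relation list. A sketch of the resulting syntax (names and filters hypothetical):

```sql
-- The WHERE clause is accepted when adding a table:
ALTER PUBLICATION pub_a ADD TABLE items WHERE (id > 2000);

-- DROP TABLE takes a plain relation list; no WHERE clause is parsed here:
ALTER PUBLICATION pub_a DROP TABLE items;
```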
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..e946f17 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..9d86a10 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get the relation quals. DISTINCT prevents the same expression from
+ * being included multiple times in the final expression when the table
+ * is published in more than one publication.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
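For a table with row filters in two subscribed publications, the COPY command built by copy_table() above would look roughly like the following sketch (column names, table name, and filters are hypothetical):

```sql
COPY (SELECT id, name FROM public.items
      WHERE (id > 2000) AND (id <= 3000)) TO STDOUT
```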
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 14d737f..1220203 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +148,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +157,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +639,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache ExprState using CacheMemoryContext. This is the same code as
+ * ExecPrepareExpr() but that is not used because it doesn't use an EState.
+ * It should probably be another function in the executor to handle the
+ * execution outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change should be replicated, false otherwise.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the table has
+ * different row filters in those publications, all row filters must
+ * match for the change to be replicated.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +808,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +832,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +839,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * The schema must be sent before the relation-switching logic below,
+ * because maybe_send_schema() also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +872,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +906,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +975,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1113,9 +1294,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1138,6 +1320,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1149,6 +1333,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1162,6 +1347,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for the row filter. The TupleDesc must
+ * live as long as the cache entry remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1171,6 +1372,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1230,9 +1434,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1350,6 +1578,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1359,6 +1588,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1376,7 +1607,15 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ {
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
+ }
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index a485fb2..0f4892c 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4141,6 +4141,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4151,9 +4152,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4162,6 +4170,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4202,6 +4211,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4234,8 +4247,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 29af845..f932a70 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -629,6 +629,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index a33d77c..83249e8 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6329,8 +6329,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6359,6 +6366,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 82f2536..e5df91e 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,7 +85,9 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
+ Oid relid;
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -116,7 +118,7 @@ extern List *GetPubPartitionOptionRelations(List *result,
Oid relid);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3138877..1dfdaf3 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3640,6 +3640,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
typedef struct CreatePublicationStmt
@@ -3647,7 +3648,7 @@ typedef struct CreatePublicationStmt
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3660,7 +3661,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 82bce9b..a75ab3c 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -167,6 +167,77 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index e5745d5..5b22d51 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -100,6 +100,38 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
new file mode 100644
index 0000000..6428f0d
--- /dev/null
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -0,0 +1,300 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions must succeed. In this case, a row is replicated only if its c
+# value is divisible by both 2 AND 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# The publication option publish_via_partition_root is false, so the row
+# filter of each partition is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700)                      NO, the row filter contains column b, which is not
+#   part of the PK or REPLICA IDENTITY, so the old tuple has b = NULL; hence the
+#   row filter evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# The publication option publish_via_partition_root is true, so the row
+# filter of the root partitioned table is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After the TRUNCATE and REFRESH PUBLICATION, the initial data copy applies
+# the root partitioned table's row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
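For readers skimming the TAP test above: when a table is subscribed through several publications, a row is replicated only if every publication's row filter evaluates to true (logical AND). That semantics can be illustrated outside the server; this is a plain Python sketch of the intended behavior, not PostgreSQL code, and the names are illustrative only:

```python
# Sketch of the multi-publication semantics exercised by 025_row_filter.pl:
# a row is replicated only if ALL row-filter expressions evaluate to true.

def replicate_row(row, filters):
    """Return True if the row passes every publication's filter."""
    return all(f(row) for f in filters)

# tap_pub_1 (after SET TABLE): c % 2 = 0; tap_pub_2: c % 3 = 0
filters = [lambda r: r["c"] % 2 == 0,
           lambda r: r["c"] % 3 == 0]

# The test inserts c = 1..20 into tab_rowfilter_2.
replicated = [c for c in range(1, 21)
              if replicate_row({"c": c}, filters)]
print(replicated)  # [6, 12, 18] -- matching the test's expectation
```

This mirrors the check in the test: `count(c), min(c), max(c)` on the subscriber yields `3|6|18`.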
v32-0003-PS-ExprState-cache-modifications.patchapplication/octet-stream; name=v32-0003-PS-ExprState-cache-modifications.patchDownload
From c7f02408816b253dbbadd01c40d571d3459c3cbb Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 6 Oct 2021 00:33:05 -0400
Subject: [PATCH v32] PS - ExprState cache modifications.
Now the cached row-filter state (e.g. the ExprState list) is invalidated only
in the rel_sync_cache_relation_cb function, so ALTER PUBLICATION for one table
no longer causes the row filters of other tables to also become invalidated.
Also, all code related to caching row filters has been removed from the
get_rel_sync_entry function; the work is now done just before the filters are
needed, in the pgoutput_row_filter function.
Changes are based on suggestions from Amit [1] [2].
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
---
src/backend/replication/pgoutput/pgoutput.c | 200 +++++++++++++++++++---------
1 file changed, 136 insertions(+), 64 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 1220203..ce5e1c5 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -123,7 +123,15 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
- List *exprstate; /* ExprState for row filter */
+
+ /*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' only means that exprstate_list is correct;
+ * it doesn't mean that there actually is any row filter present for the
+ * current relid.
+ */
+ bool rowfilter_valid;
+ List *exprstate_list; /* ExprState for row filter(s) */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
@@ -161,7 +169,7 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static EState *create_estate_for_relation(Relation rel);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
/*
@@ -731,20 +739,121 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
* If it returns true, the change is replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
{
EState *estate;
ExprContext *ecxt;
ListCell *lc;
bool result = true;
+ Oid relid = RelationGetRelid(relation);
+
+ /*
+ * If the row filter cache is currently flagged "invalid" then we don't yet
+ * know whether there are any row filters for this relation.
+ *
+ * This code usually executes only once per relation.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ bool am_partition = get_rel_relispartition(relid);
+ MemoryContext oldctx;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains. Release the tuple table slot if it
+ * already exists.
+ */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState(s) and cache them in the
+ * entry->exprstate_list.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+ Oid pub_relid = relid;
+
+ if (pub->pubviaroot && am_partition)
+ {
+ if (pub->alltables)
+ pub_relid = llast_oid(get_partition_ancestors(relid));
+ else
+ {
+ List *ancestors = get_partition_ancestors(relid);
+ ListCell *lc2;
+
+ /*
+ * Find the "topmost" ancestor that is in this
+ * publication.
+ */
+ foreach(lc2, ancestors)
+ {
+ Oid ancestor = lfirst_oid(lc2);
+
+ if (list_member_oid(GetRelationPublications(ancestor),
+ pub->oid))
+ {
+ pub_relid = ancestor;
+ }
+ }
+ }
+ }
+
+ /*
+ * Look up whether there is a row filter, and if so build the ExprState for it.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(pub_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate_list = lappend(entry->exprstate_list, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+
+ } /* loop all subscribed publications */
+
+ entry->rowfilter_valid = true;
+ }
/* Bail out if there is no row filter */
- if (entry->exprstate == NIL)
+ if (entry->exprstate_list == NIL)
return true;
elog(DEBUG3, "table \"%s.%s\" has row filter",
- get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
- get_rel_name(relation->rd_id));
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
PushActiveSnapshot(GetTransactionSnapshot());
@@ -761,7 +870,7 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
* different row filter in these publications, all row filters must be
* matched in order to replicate this change.
*/
- foreach(lc, entry->exprstate)
+ foreach(lc, entry->exprstate_list)
{
ExprState *exprstate = (ExprState *) lfirst(lc);
@@ -840,7 +949,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
break;
/*
@@ -873,7 +982,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -907,7 +1016,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1318,10 +1427,11 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
- entry->exprstate = NIL;
+ entry->exprstate_list = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1333,7 +1443,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
- TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1347,22 +1456,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
publications_valid = true;
}
- /* Release tuple table slot */
- if (entry->scantuple != NULL)
- {
- ExecDropSingleTupleTableSlot(entry->scantuple);
- entry->scantuple = NULL;
- }
-
- /*
- * Create a tuple table slot for row filter. TupleDesc must live as
- * long as the cache remains.
- */
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- tupdesc = CreateTupleDescCopy(tupdesc);
- entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
- MemoryContextSwitchTo(oldctx);
-
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1372,9 +1465,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
{
Publication *pub = lfirst(lc);
bool publish = false;
- HeapTuple rftuple;
- Datum rfdatum;
- bool rfisnull;
if (pub->alltables)
{
@@ -1434,33 +1524,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- /*
- * Cache row filter, if available. All publication-table mappings
- * must be checked. If it is a partition and pubviaroot is true,
- * use the row filter of the topmost partitioned table instead of
- * the row filter of its own partition.
- */
- rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
- if (HeapTupleIsValid(rftuple))
- {
- rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
-
- if (!rfisnull)
- {
- Node *rfnode;
- ExprState *exprstate;
-
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = stringToNode(TextDatumGetCString(rfdatum));
-
- /* Prepare for expression execution */
- exprstate = pgoutput_row_filter_init_expr(rfnode);
- entry->exprstate = lappend(entry->exprstate, exprstate);
- MemoryContextSwitchTo(oldctx);
- }
-
- ReleaseSysCache(rftuple);
- }
}
list_free(pubids);
@@ -1567,6 +1630,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanup. (It will be rebuilt later if needed.)
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstate_list != NIL)
+ {
+ list_free_deep(entry->exprstate_list);
+ entry->exprstate_list = NIL;
+ }
}
}
@@ -1607,12 +1685,6 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
-
- if (entry->exprstate != NIL)
- {
- list_free_deep(entry->exprstate);
- entry->exprstate = NIL;
- }
}
MemoryContextSwitchTo(oldctx);
--
1.8.3.1
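The pgoutput changes above evaluate every cached filter expression for the relation and replicate the change only if all of them match (the `foreach` loop over `entry->exprstate_list`). A minimal stand-alone sketch of that AND-of-all-filters semantics, using a plain function-pointer array as a hypothetical stand-in for the list of initialized `ExprState`s:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an initialized filter expression (ExprState). */
typedef bool (*RowFilter) (int row_value);

/*
 * Evaluate every cached filter against the tuple value; the change is
 * replicated only if all filters match, mirroring the loop over
 * entry->exprstate_list in pgoutput_row_filter.
 */
static bool
row_matches_all_filters(const RowFilter *filters, size_t nfilters, int row_value)
{
	size_t		i;

	for (i = 0; i < nfilters; i++)
	{
		if (!filters[i] (row_value))
			return false;		/* short-circuit on first non-matching filter */
	}
	return true;				/* no filters at all, or all matched */
}

/* Two toy filters, as if from WHERE (id > 2000 AND id <= 3000) split in two. */
static bool gt_2000(int v) { return v > 2000; }
static bool le_3000(int v) { return v <= 3000; }
```

With no filters cached the row is always sent, which matches the behavior of a publication without a WHERE clause.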
v32-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patch (application/octet-stream)
From 2961710963d3cc69e3a6b4723528b1c96bc481ae Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 6 Oct 2021 00:29:44 -0400
Subject: [PATCH v32] PS - Add tab auto-complete support for the Row Filter
WHERE.
The following auto-completions are added:
Complete "CREATE PUBLICATION <name> FOR TABLE <name>" with "WHERE (".
Complete "ALTER PUBLICATION <name> ADD TABLE <name>" with "WHERE (".
Complete "ALTER PUBLICATION <name> SET TABLE <name>" with "WHERE (".
---
src/bin/psql/tab-complete.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index ecae9df..bd35d19 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1648,6 +1648,11 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET", "TABLE", MatchAny)
+ || Matches("ALTER", "PUBLICATION", MatchAny, "ADD", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (");
/* ALTER PUBLICATION <name> SET ( */
else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("publish", "publish_via_partition_root");
@@ -2693,9 +2698,10 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("TABLE", "ALL TABLES");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL"))
COMPLETE_WITH("TABLES");
- else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES")
- || Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("WITH (");
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
--
1.8.3.1
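The tab-complete patch works by matching the words of the input buffer against fixed patterns, where `MatchAny` is a one-word wildcard (e.g. the publication name or table name). A toy sketch of that matching idea, using `"ANY"` as a hypothetical stand-in for psql's `MatchAny`:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Toy version of psql's word-pattern test: each pattern word must equal the
 * corresponding input word, except the wildcard "ANY", which matches any
 * single word (psql's real macros are Matches()/TailMatches() with MatchAny).
 */
static bool
words_match(const char **input, const char **pattern, int nwords)
{
	int			i;

	for (i = 0; i < nwords; i++)
	{
		if (strcmp(pattern[i], "ANY") == 0)
			continue;			/* wildcard: any single word is fine */
		if (strcmp(pattern[i], input[i]) != 0)
			return false;
	}
	return true;
}
```

When the pattern for "CREATE PUBLICATION &lt;name&gt; FOR TABLE &lt;name&gt;" matches, the patch offers "WHERE (" alongside "WITH (".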
v32-0005-PS-POC-Row-filter-validation-walker.patch (application/octet-stream)
From 8a73d9854b69980f3e64b4b14b3bfc6792424672 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 6 Oct 2021 04:31:05 -0400
Subject: [PATCH v32] PS - POC Row filter validation walker
This patch implements a parse-tree "walker" to validate a row-filter expression.
Only very simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no functions.
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr
This POC patch also disables (#if 0) all other row-filter errors which were thrown for
EXPR_KIND_PUBLICATION_WHERE in the 0001 patch.
Some regression tests are updated due to the modified validation error messages.
---
src/backend/catalog/dependency.c | 68 +++++++++++++++++++++++++++++++
src/backend/catalog/pg_publication.c | 14 +++++--
src/backend/parser/parse_agg.c | 5 ++-
src/backend/parser/parse_expr.c | 6 ++-
src/backend/parser/parse_func.c | 3 ++
src/backend/parser/parse_oper.c | 2 +
src/include/catalog/dependency.h | 2 +-
src/test/regress/expected/publication.out | 17 +++++---
src/test/regress/sql/publication.sql | 2 +
9 files changed, 107 insertions(+), 12 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index e81f093..405b3cd 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -132,6 +132,12 @@ typedef struct
int subflags; /* flags to pass down when recursing to obj */
} ObjectAddressAndFlags;
+/* for rowfilter_walker */
+typedef struct
+{
+ char *relname;
+} rf_context;
+
/* for find_expr_references_walker */
typedef struct
{
@@ -1554,6 +1560,68 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Walker that checks that the row filter expression is legal. Allow only simple
+ * or compound expressions like:
+ *
+ * "(Var Op Const)" or
+ * "(Var Op Const) Bool (Var Op Const)"
+ *
+ * Nothing more complicated is permitted. Specifically, no functions of any kind
+ * and no user-defined operators.
+ */
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var) || IsA(node, Const) || IsA(node, BoolExpr))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ forbidden = _("function calls are not allowed");
+ }
+ else
+ {
+ elog(DEBUG1, "row filter contained something unexpected: %s", nodeToString(node));
+ forbidden = _("too complex");
+ }
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ context->relname),
+ errdetail("%s", forbidden),
+ errhint("only simple expressions using columns and constants are allowed")
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
+/*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if it encounters anything not permitted or unexpected.
+ */
+void
+rowfilter_validator(char *relname, Node *expr)
+{
+ rf_context context = {0};
+
+ context.relname = relname;
+ rowfilter_walker(expr, &context);
+}
+
+/*
* Find all the columns referenced by the row-filter expression and return what
* is found as a list of RfCol. This list is used for row-filter validation.
*/
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index eac7449..e49b3ca 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -144,7 +144,7 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Walk the parse-tree to decide if the row-filter is valid or not.
*/
static void
-rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node *rfnode, Relation rel)
{
Oid relid = RelationGetRelid(rel);
char *relname = RelationGetRelationName(rel);
@@ -152,6 +152,14 @@ rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
/*
* Rule:
*
+ * Walk the parse-tree and reject anything more complicated than a very
+ * simple expression.
+ */
+ rowfilter_validator(relname, rfnode);
+
+ /*
+ * Rule:
+ *
* If the publish operation contains "delete" then only columns that
* are allowed by the REPLICA IDENTITY rules are permitted to be used in
* the row-filter WHERE clause.
@@ -305,13 +313,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
whereclause = transformWhereClause(pstate,
copyObject(pri->whereClause),
EXPR_KIND_PUBLICATION_WHERE,
- "PUBLICATION");
+ "PUBLICATION WHERE");
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
/* Validate the row-filter. */
- rowfilter_expr_checker(pub, whereclause, targetrel);
+ rowfilter_expr_checker(pub, pstate, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..5e0c391 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,12 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
if (isAgg)
err = _("aggregate functions are not allowed in publication WHERE expressions");
else
err = _("grouping operations are not allowed in publication WHERE expressions");
-
+#endif
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +952,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("window functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..3519e62 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -201,6 +201,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
case T_FuncCall:
{
+#if 0
/*
* Forbid functions in publication WHERE condition
*/
@@ -209,6 +210,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("functions are not allowed in publication WHERE expressions"),
parser_errposition(pstate, exprLocation(expr))));
+#endif
result = transformFuncCall(pstate, (FuncCall *) expr);
break;
@@ -1777,7 +1779,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("cannot use subquery in publication WHERE expression");
+#endif
break;
/*
@@ -3100,7 +3104,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index e946f17..de9600f 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,10 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+ pstate->p_hasTargetSRFs = true;
+#if 0
err = _("set-returning functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..b3588df 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,12 +718,14 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+#if 0
/* Check it's not a custom operator for publication WHERE expressions */
if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("user-defined operators are not allowed in publication WHERE expressions"),
parser_errposition(pstate, location)));
+#endif
/* Do typecasting and build the expression tree */
if (ltree == NULL)
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 2c7310e..dd69aff 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -16,7 +16,6 @@
#include "catalog/objectaddress.h"
-
/*
* Precise semantics of a dependency relationship are specified by the
* DependencyType code (which is stored in a "char" field in pg_depend,
@@ -156,6 +155,7 @@ typedef struct RfCol {
int attnum;
} RfCol;
extern List *rowfilter_find_cols(Node *expr, Oid relId);
+extern void rowfilter_validator(char *relname, Node *expr);
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index e7c8c19..e10adc8 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -218,16 +218,21 @@ Tables:
-- fail - functions disallowed
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl4"
+DETAIL: function calls are not allowed
+HINT: only simple expressions using columns and constants are allowed
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+HINT: only simple expressions using columns and constants are allowed
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: too complex
+HINT: only simple expressions using columns and constants are allowed
-- fail - WHERE not allowed in DROP
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: syntax error at or near "WHERE"
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 6701d50..a30657b 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -123,6 +123,8 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
-- fail - WHERE not allowed in DROP
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
--
1.8.3.1
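The essential shape of the POC walker above is a recursive tree traversal that whitelists a few node types and reports the first forbidden construct. A self-contained sketch under assumed stand-in node tags (the real patch uses PostgreSQL's `NodeTag`, `expression_tree_walker`, and `ereport`):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in node tags; the real code uses PostgreSQL's NodeTag. */
typedef enum { N_VAR, N_CONST, N_OPEXPR, N_BOOLEXPR, N_FUNCEXPR } NodeTag;

typedef struct Node
{
	NodeTag		tag;
	struct Node *left;			/* children; NULL for leaf nodes */
	struct Node *right;
	int			opno;			/* operator OID, only used for N_OPEXPR */
} Node;

#define FIRST_NORMAL_OBJECT_ID 16384	/* user-defined OIDs start here */

/*
 * Return NULL if the expression is a legal row filter, else a message
 * describing the first forbidden construct found (mirrors rowfilter_walker,
 * which raises an error instead of returning).
 */
static const char *
check_row_filter(const Node *node)
{
	const char *err;

	if (node == NULL)
		return NULL;

	switch (node->tag)
	{
		case N_VAR:
		case N_CONST:
		case N_BOOLEXPR:
			break;				/* OK */
		case N_OPEXPR:
			if (node->opno >= FIRST_NORMAL_OBJECT_ID)
				return "user-defined operators are not allowed";
			break;
		case N_FUNCEXPR:
			return "function calls are not allowed";
		default:
			return "too complex";	/* anything unexpected */
	}

	/* recurse into children, as expression_tree_walker would */
	if ((err = check_row_filter(node->left)) != NULL)
		return err;
	return check_row_filter(node->right);
}
```

Checking the whole subtree even for allowed nodes is what lets the walker catch a function call buried inside an otherwise simple `(Var Op Const) AND ...` expression.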
v32-0004-PS-Row-filter-validation-of-replica-identity.patch (application/octet-stream)
From 82e14c19dff9f17e2a511a84514f4a8d5a450a92 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 6 Oct 2021 04:11:40 -0400
Subject: [PATCH v32] PS - Row filter validation of replica identity.
This patch introduces some additional row filter validation. Currently it is
implemented only for the publish mode "delete": it validates that any column
referenced in the filter expression is part of the REPLICA IDENTITY or primary
key.
Also updated test code.
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
---
src/backend/catalog/dependency.c | 58 ++++++++++++++++++
src/backend/catalog/pg_publication.c | 74 ++++++++++++++++++++++-
src/include/catalog/dependency.h | 6 ++
src/test/regress/expected/publication.out | 97 +++++++++++++++++++++++++++++--
src/test/regress/sql/publication.sql | 79 ++++++++++++++++++++++++-
src/test/subscription/t/025_row_filter.pl | 7 +--
6 files changed, 309 insertions(+), 12 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index 91c3e97..e81f093 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -1554,6 +1554,64 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Find all the columns referenced by the row-filter expression and return what
+ * is found as a list of RfCol. This list is used for row-filter validation.
+ */
+List *
+rowfilter_find_cols(Node *expr, Oid relId)
+{
+ find_expr_references_context context;
+ RangeTblEntry rte;
+ int ref;
+ List *rfcol_list = NIL;
+
+ context.addrs = new_object_addresses();
+
+ /* We gin up a rather bogus rangetable list to handle Vars */
+ MemSet(&rte, 0, sizeof(rte));
+ rte.type = T_RangeTblEntry;
+ rte.rtekind = RTE_RELATION;
+ rte.relid = relId;
+ rte.relkind = RELKIND_RELATION; /* no need for exactness here */
+ rte.rellockmode = AccessShareLock;
+
+ context.rtables = list_make1(list_make1(&rte));
+
+ /* Scan the expression tree for referenceable objects */
+ find_expr_references_walker(expr, &context);
+
+ /* Remove any duplicates */
+ eliminate_duplicate_dependencies(context.addrs);
+
+ /* Build/Return the list of columns referenced by this Row Filter */
+ for (ref = 0; ref < context.addrs->numrefs; ref++)
+ {
+ ObjectAddress *thisobj = context.addrs->refs + ref;
+
+ if (thisobj->classId == RelationRelationId)
+ {
+ RfCol *rfcol;
+
+ /*
+ * The parser already took care of ensuring columns must be from
+ * the correct table.
+ */
+ Assert(thisobj->objectId == relId);
+
+ rfcol = palloc(sizeof(RfCol));
+ rfcol->name = get_attname(thisobj->objectId, thisobj->objectSubId, false);
+ rfcol->attnum = thisobj->objectSubId;
+
+ rfcol_list = lappend(rfcol_list, rfcol);
+ }
+ }
+
+ free_object_addresses(context.addrs);
+
+ return rfcol_list;
+}
+
+/*
* recordDependencyOnExpr - find expression dependencies
*
* This is used to find the dependencies of rules, constraint expressions,
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 1d0f77d..eac7449 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -141,9 +141,76 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
}
/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
+ * Walk the parse-tree to decide if the row-filter is valid or not.
*/
+static void
+rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+{
+ Oid relid = RelationGetRelid(rel);
+ char *relname = RelationGetRelationName(rel);
+
+ /*
+ * Rule:
+ *
+ * If the publish operation contains "delete" then only columns that
+ * are allowed by the REPLICA IDENTITY rules are permitted to be used in
+ * the row-filter WHERE clause.
+ *
+ * TODO - check later for publish "update" case.
+ */
+ if (pub->pubactions.pubdelete)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ List *rfcols;
+ ListCell *lc;
+ Bitmapset *bms_okcols;
+
+ /*
+ * Find what are the cols that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+
+ /*
+ * Find what cols are referenced in the row filter WHERE clause,
+ * and validate that each of those referenced cols is allowed.
+ */
+ rfcols = rowfilter_find_cols(rfnode, relid);
+ foreach(lc, rfcols)
+ {
+ RfCol *rfcol = lfirst(lc);
+ char *colname = rfcol->name;
+ int attnum = rfcol->attnum;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, bms_okcols))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ relname),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY", colname)));
+ }
+ }
+
+ bms_free(bms_okcols);
+ list_free_deep(rfcols);
+ }
+ }
+}
+
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -242,6 +309,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 2885f35..2c7310e 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -151,6 +151,12 @@ extern void performDeletion(const ObjectAddress *object,
extern void performMultipleDeletions(const ObjectAddresses *objects,
DropBehavior behavior, int flags);
+typedef struct RfCol {
+ char *name;
+ int attnum;
+} RfCol;
+extern List *rowfilter_find_cols(Node *expr, Oid relId);
+
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
DependencyType behavior);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index a75ab3c..e7c8c19 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -172,13 +172,15 @@ CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -188,7 +190,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -199,7 +201,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -210,7 +212,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
@@ -238,6 +240,91 @@ DROP TABLE testpub_rf_tbl4;
DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 5b22d51..6701d50 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -105,7 +105,9 @@ CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -132,6 +134,81 @@ DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index 6428f0d..dc9becc 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -223,9 +225,7 @@ $node_publisher->wait_for_catchup($appname);
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -234,7 +234,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
--
1.8.3.1
v32-0006-Support-updates-based-on-old-and-new-tuple-in-ro.patch (application/octet-stream) Download
From 99e1e0c593071067702a04b54177679d808cfac3 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 13 Oct 2021 04:23:36 -0400
Subject: [PATCH v32] Support updates based on old and new tuple in row
filters.
When applying row filter on updates, check both old_tuple and new_tuple
to decide how an update needs to be transformed.
UPDATE
old-row (match) new-row (no match) -> DELETE
old-row (no match) new row (match) -> INSERT
old-row (match) new row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Also tuples that have been deformed will be cached in slots to avoid
multiple deforming of tuples.
---
src/backend/replication/logical/proto.c | 122 ++++++++++++++++
src/backend/replication/pgoutput/pgoutput.c | 219 ++++++++++++++++++++++++++--
src/include/replication/logicalproto.h | 4 +
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/025_row_filter.pl | 4 +-
5 files changed, 336 insertions(+), 19 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..6b14340 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -32,6 +33,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
HeapTuple tuple, bool binary);
+static void logicalrep_write_tuple_cached(StringInfo out, Relation rel,
+ TupleTableSlot *slot, bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -438,6 +441,38 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
}
/*
+ * Write UPDATE to the output stream using cached virtual slots.
+ * Cached updates will have both old tuple and new tuple.
+ */
+void
+logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple, bool binary)
+{
+ pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
+
+ Assert(rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT ||
+ rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
+ rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX);
+
+ /* transaction ID (if not valid, we're not streaming) */
+ if (TransactionIdIsValid(xid))
+ pq_sendint32(out, xid);
+
+ /* use Oid as relation identifier */
+ pq_sendint32(out, RelationGetRelid(rel));
+
+ if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ pq_sendbyte(out, 'O'); /* old tuple follows */
+ else
+ pq_sendbyte(out, 'K'); /* old key follows */
+ logicalrep_write_tuple_cached(out, rel, oldtuple, binary);
+
+ pq_sendbyte(out, 'N'); /* new tuple follows */
+ logicalrep_write_tuple_cached(out, rel, newtuple, binary);
+}
+
+
+/*
* Write UPDATE to the output stream.
*/
void
@@ -746,6 +781,93 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
}
/*
+ * Write a tuple to the outputstream using cached slot, in the most efficient format possible.
+ */
+static void
+logicalrep_write_tuple_cached(StringInfo out, Relation rel, TupleTableSlot *slot, bool binary)
+{
+ TupleDesc desc;
+ int i;
+ uint16 nliveatts = 0;
+ HeapTuple tuple = ExecFetchSlotHeapTuple(slot, false, NULL);
+
+ desc = RelationGetDescr(rel);
+
+ for (i = 0; i < desc->natts; i++)
+ {
+ if (TupleDescAttr(desc, i)->attisdropped || TupleDescAttr(desc, i)->attgenerated)
+ continue;
+ nliveatts++;
+ }
+ pq_sendint16(out, nliveatts);
+
+ /* try to allocate enough memory from the get-go */
+ enlargeStringInfo(out, tuple->t_len +
+ nliveatts * (1 + 4));
+
+ /* Write the values */
+ for (i = 0; i < desc->natts; i++)
+ {
+ HeapTuple typtup;
+ Form_pg_type typclass;
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ if (att->attisdropped || att->attgenerated)
+ continue;
+
+ if (slot->tts_isnull[i])
+ {
+ pq_sendbyte(out, LOGICALREP_COLUMN_NULL);
+ continue;
+ }
+
+ if (att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(slot->tts_values[i]))
+ {
+ /*
+ * Unchanged toasted datum. (Note that we don't promise to detect
+ * unchanged data in general; this is just a cheap check to avoid
+ * sending large values unnecessarily.)
+ */
+ pq_sendbyte(out, LOGICALREP_COLUMN_UNCHANGED);
+ continue;
+ }
+
+ typtup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(att->atttypid));
+ if (!HeapTupleIsValid(typtup))
+ elog(ERROR, "cache lookup failed for type %u", att->atttypid);
+ typclass = (Form_pg_type) GETSTRUCT(typtup);
+
+ /*
+ * Send in binary if requested and type has suitable send function.
+ */
+ if (binary && OidIsValid(typclass->typsend))
+ {
+ bytea *outputbytes;
+ int len;
+
+ pq_sendbyte(out, LOGICALREP_COLUMN_BINARY);
+ outputbytes = OidSendFunctionCall(typclass->typsend, slot->tts_values[i]);
+ len = VARSIZE(outputbytes) - VARHDRSZ;
+ pq_sendint(out, len, 4); /* length */
+ pq_sendbytes(out, VARDATA(outputbytes), len); /* data */
+ pfree(outputbytes);
+ }
+ else
+ {
+ char *outputstr;
+
+ pq_sendbyte(out, LOGICALREP_COLUMN_TEXT);
+ outputstr = OidOutputFunctionCall(typclass->typoutput, slot->tts_values[i]);
+ pq_sendcountedtext(out, outputstr, strlen(outputstr), false);
+ pfree(outputstr);
+ }
+
+ ReleaseSysCache(typtup);
+ }
+}
+
+
+/*
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index ce5e1c5..46fe886 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -133,6 +133,8 @@ typedef struct RelationSyncEntry
bool rowfilter_valid;
List *exprstate_list; /* ExprState for row filter(s) */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* tuple table slot for storing deformed new tuple */
+ TupleTableSlot *old_tuple; /* tuple table slot for storing deformed old tuple */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -167,10 +169,16 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot,
+ RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry,
+ ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -734,18 +742,101 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * Update is checked against the row filter, if any.
+ * Updates are transformed to inserts and deletes based on the
+ * old_tuple and new_tuple. The new action is updated in the
+ * action parameter. If not updated, action remains as update.
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (no match) new row (match) -> INSERT
+ * old-row (match) new row (match) -> UPDATE
+ * old-row (no match) new-row (no match) -> (drop change)
+ * If it returns true, the change is to be replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter_update(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
{
- EState *estate;
- ExprContext *ecxt;
+ TupleDesc desc = entry->scantuple->tts_tupleDescriptor;
+ int i;
+ bool old_matched, new_matched;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate_list == NIL)
+ return true;
+
+	/* updates require a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /*
+	 * If no old_tuple, then none of the replica identity columns changed
+ * and this would reduce to a simple update.
+ */
+ if (!oldtuple)
+ return pgoutput_row_filter(relation, NULL, newtuple, entry);
+
+
+ old_slot = MakeSingleTupleTableSlot(desc, &TTSOpsVirtual);
+ new_slot = MakeSingleTupleTableSlot(desc, &TTSOpsVirtual);
+ tmp_new_slot = MakeSingleTupleTableSlot(desc, &TTSOpsVirtual);
+
+ /*
+ * Unchanged toasted replica identity columns are
+ * only detoasted in the old tuple, copy this over to the newtuple.
+ */
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+ entry->old_tuple = old_slot;
+ entry->new_tuple = new_slot;
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, entry->new_tuple);
+
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ }
+
+ }
+
+ old_matched = pgoutput_row_filter_virtual(relation, old_slot, entry);
+ new_matched = pgoutput_row_filter_virtual(relation, tmp_new_slot, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && !old_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter, the first time.
+ */
+
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
+{
+ Oid relid = RelationGetRelid(relation);
ListCell *lc;
- bool result = true;
- Oid relid = RelationGetRelid(relation);
/*
* If the row filter caching is currently flagged "invalid" then it means we
@@ -769,9 +860,13 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
+
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = NULL;
+ entry->new_tuple = NULL;
+
MemoryContextSwitchTo(oldctx);
/*
@@ -846,6 +941,76 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
entry->rowfilter_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, using virtual slots.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ ListCell *lc;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate_list == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = slot;
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, entry->exprstate_list)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ ListCell *lc;
/* Bail out if there is no row filter */
if (entry->exprstate_list == NIL)
@@ -941,6 +1106,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -949,7 +1117,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
break;
/*
@@ -980,9 +1148,11 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter_update(relation, oldtuple, newtuple, relentry,
+ &modified_action))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1005,8 +1175,29 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ if (relentry->old_tuple && relentry->new_tuple)
+ logicalrep_write_update_cached(ctx->out, xid, relation,
+ relentry->old_tuple, relentry->new_tuple, data->binary);
+ else
+ logicalrep_write_update(ctx->out, xid, relation, oldtuple,
+ newtuple, data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1016,7 +1207,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..ba71f3f 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -212,6 +213,9 @@ extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
HeapTuple newtuple, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index dc9becc..742bbbe 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -220,7 +220,8 @@ $node_publisher->wait_for_catchup($appname);
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+# but row deleted after the update below.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -232,7 +233,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
--
1.8.3.1
On Wed, Oct 13, 2021 at 10:00 PM Ajin Cherian <itsajin@gmail.com> wrote:
I have made the change to use the virtual slot for expression
evaluation and avoided tuple deformation.
I started looking at the v32-0006 patch and have some initial comments.
Shouldn't old_slot, new_slot and tmp_new_slot be cached in the
RelationSyncEntry, similar to scantuple?
Currently, these slots are always getting newly allocated each call to
pgoutput_row_filter_update() - and also, seemingly never deallocated.
We previously found that allocating slots each time for each row
filtered (over 1000s of rows) had a huge performance overhead.
As an example, scantuple was originally newly allocated each row
filtered, and to filter 1,000,000 rows in a test case it was taking 40
seconds. Caching the allocation in RelationSyncEntry reduced it down
to about 5 seconds.
Regards,
Greg Nancarrow
Fujitsu Australia
On Fri, Oct 15, 2021 at 3:30 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Wed, Oct 13, 2021 at 10:00 PM Ajin Cherian <itsajin@gmail.com> wrote:
I have made the change to use the virtual slot for expression
evaluation and avoided tuple deformation.
I started looking at the v32-0006 patch and have some initial comments.
Shouldn't old_slot, new_slot and tmp_new_slot be cached in the
RelationSyncEntry, similar to scantuple?
Currently, these slots are always getting newly allocated each call to
pgoutput_row_filter_update() - and also, seemingly never deallocated.
We previously found that allocating slots each time for each row
filtered (over 1000s of rows) had a huge performance overhead.
As an example, scantuple was originally newly allocated each row
filtered, and to filter 1,000,000 rows in a test case it was taking 40
seconds. Caching the allocation in RelationSyncEntry reduced it down
to about 5 seconds.
Thanks for the comment, I have modified patch 6 to cache old_tuple,
new_tuple and tmp_new_tuple.
On Tue, Oct 12, 2021 at 1:37 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+     (!old_slot->tts_isnull[i] &&
+     !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+     tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+     newtup_changed = true;
+ }
If the attribute is stored EXTERNAL_ONDISK on the new tuple and it is
not null in the old tuple, then it must be logged completely in the old
tuple, so instead of checking
!(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i])), it should be
asserted.
Sorry, I missed this in my last update
For this to be true, shouldn't the fix in [1] be committed? I will
change this once that change is committed.
[1]: /messages/by-id/OS0PR01MB611342D0A92D4F4BF26C0F47FB229@OS0PR01MB6113.jpnprd01.prod.outlook.com
regards,
Ajin Cherian
Fujitsu Australia
Attachments:
v33-0001-Row-filter-for-logical-replication.patch (application/octet-stream) Download
From e6d9172936179cffacce726f8f1caf1bdf53ca45 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Tue, 19 Oct 2021 06:55:05 -0400
Subject: [PATCH v33] Row filter for logical replication.
This feature adds row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If subscriber is a pre-15 version, data
synchronization won't use row filters if they are defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 50 ++++-
src/backend/commands/publicationcmds.c | 104 ++++++----
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 5 +-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 257 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 4 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 5 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 71 +++++++
src/test/regress/sql/publication.sql | 32 +++
src/test/subscription/t/025_row_filter.pl | 300 ++++++++++++++++++++++++++++
25 files changed, 997 insertions(+), 78 deletions(-)
create mode 100644 src/test/subscription/t/025_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index fd6910d..43bc11f 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6233,6 +6233,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b..4bb4314 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbc..8f78fbb 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -183,6 +187,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will
+ not be replicated. That is because the old row is used, and it contains
+ only the primary key or the <literal>REPLICA IDENTITY</literal> columns;
+ the remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ and <command>UPDATE</command> operations, any column may be used in the
+ <literal>WHERE</literal> clause, because the new row is used and it
+ contains all columns. A <literal>NULL</literal> value causes the
+ expression to evaluate to false; avoid using columns without not-null
+ constraints in the <literal>WHERE</literal> clause. The <literal>WHERE</literal>
+ clause does not allow functions or user-defined operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -197,6 +216,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -210,6 +234,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..5a9430e 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -206,7 +206,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether to copy pre-existing data in the publications
that are being subscribed to when the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is running a
+ <productname>PostgreSQL</productname> version earlier than 15, any
+ row filtering is ignored.
</para>
</listitem>
</varlistentry>
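The copy_data rule described above, where a table published in several publications with different WHERE clauses only copies rows satisfying every expression, is plain predicate conjunction. A minimal sketch of that decision (a hypothetical illustration with made-up predicates, not the server code):

```python
# Hypothetical sketch: initial-sync row filtering across publications.
# Each publication contributes one predicate; a row is copied only if
# every predicate holds (the expressions are AND'ed together).

def should_copy(row, predicates):
    """Return True if the row satisfies all publication WHERE predicates."""
    return all(pred(row) for pred in predicates)

# Two publications publish the same table with different WHERE clauses.
pub_a = lambda r: r["id"] > 2000          # WHERE (id > 2000)
pub_b = lambda r: r["active"] is True     # WHERE (active IS TRUE)

assert should_copy({"id": 2500, "active": True}, [pub_a, pub_b])
assert not should_copy({"id": 2500, "active": False}, [pub_a, pub_b])
```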
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9cd0c82..1d0f77d 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -34,6 +34,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -178,22 +181,28 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -210,10 +219,30 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -227,6 +256,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -243,6 +278,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 9c7f916..747f388 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -391,38 +391,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- PublicationRelInfo *newpubrel;
-
- newpubrel = (PublicationRelInfo *) lfirst(newlc);
- if (RelationGetRelid(newpubrel->relation) == oldrelid)
- {
- found = true;
- break;
- }
- }
- /* Not yet in the list, open it and add to the list */
- if (!found)
- {
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
- }
+ PublicationRelInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -552,9 +538,10 @@ RemovePublicationById(Oid pubid)
}
/*
- * Open relations specified by a PublicationTable list.
- * In the returned list of PublicationRelInfo, tables are locked
- * in ShareUpdateExclusiveLock mode in order to add them to a publication.
+ * Open relations specified by a RangeVar list (PublicationTable or Relation).
+ *
+ * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
+ * add them to a publication.
*/
static List *
OpenTableList(List *tables)
@@ -562,22 +549,46 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelInfo *pub_rel;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- PublicationTable *t = lfirst_node(PublicationTable, lc);
- bool recurse = t->relation->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
- PublicationRelInfo *pub_rel;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a list of PublicationTable
+ * nodes (relation plus optional WHERE clause), whereas DROP TABLE
+ * provides a list of RangeVar nodes. Check which type each element is.
+ */
+ if (IsA(lfirst(lc), PublicationTable))
+ whereclause = true;
+ else
+ whereclause = false;
+
+ if (whereclause)
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+ else
+ {
+ rv = lfirst_node(RangeVar, lc);
+ }
+
+ recurse = rv->inh;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
- rel = table_openrv(t->relation, ShareUpdateExclusiveLock);
+ rel = table_openrv(rv, ShareUpdateExclusiveLock);
myrelid = RelationGetRelid(rel);
/*
@@ -594,7 +605,12 @@ OpenTableList(List *tables)
}
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = myrelid;
pub_rel->relation = rel;
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
@@ -629,7 +645,13 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = childrelid;
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -656,6 +678,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -676,7 +700,7 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(pub_rel->relid, GetUserId()))
aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
RelationGetRelationName(rel));
@@ -705,11 +729,9 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
PublicationRelInfo *pubrel = (PublicationRelInfo *) lfirst(lc);
- Relation rel = pubrel->relation;
- Oid relid = RelationGetRelid(rel);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pubrel->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -719,7 +741,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pubrel->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 70e9e54..1f0904d 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4816,6 +4816,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 19eff20..84117bf 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2300,6 +2300,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 08f1bf1..ceeb795 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9637,10 +9637,11 @@ publication_table_list:
{ $$ = lappend($1, $3); }
;
-publication_table: relation_expr
+publication_table: relation_expr OptWhereClause
{
PublicationTable *n = makeNode(PublicationTable);
n->relation = $1;
+ n->whereClause = $2;
$$ = (Node *) n;
}
;
@@ -9681,7 +9682,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE publication_table_list
+ | ALTER PUBLICATION name DROP TABLE relation_expr_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..e946f17 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..9d86a10 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get relation qual. DISTINCT prevents the same expression for a table
+ * published in multiple publications from being included more than once
+ * in the final expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
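The command construction in copy_table() above can be modeled roughly as follows: with no row filter a plain COPY is issued, otherwise the column list is wrapped in a SELECT and the distinct quals are AND'ed into its WHERE clause. This is a loose Python sketch for illustration (the quoting here is deliberately crude, and the function name is invented):

```python
# Hypothetical model of the COPY command built by copy_table():
# no filter -> "COPY tbl TO STDOUT"; with filters -> a COPY (SELECT ...)
# whose WHERE clause AND's the per-publication quals together.

def build_copy_command(nspname, relname, columns, quals):
    table = f'"{nspname}"."{relname}"'   # crude quoting, for illustration only
    if not quals:
        return f"COPY {table} TO STDOUT"
    cmd = f"COPY (SELECT {', '.join(columns)} FROM {table}"
    cmd += " WHERE " + " AND ".join(quals)
    return cmd + ") TO STDOUT"

print(build_copy_command("public", "departments", ["id", "name"],
                         ["(id > 2000)", "(active IS TRUE)"]))
# -> COPY (SELECT id, name FROM "public"."departments"
#          WHERE (id > 2000) AND (active IS TRUE)) TO STDOUT
```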
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 14d737f..1220203 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +148,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +157,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +639,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache the ExprState in CacheMemoryContext. This mirrors
+ * ExecPrepareExpr(), which is not used here because it requires an
+ * EState. The executor should probably provide another function for
+ * preparing expressions executed outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the
+ * change is not replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated; otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
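The NULL handling in pgoutput_row_filter_exec_expr() collapses SQL's three-valued logic to a boolean: an unknown result suppresses replication. A loose Python analogue, with None standing in for SQL NULL (an illustration only, not the C code):

```python
# Hypothetical sketch: a row filter result of NULL (None here) is
# treated as false, so the change is not replicated -- mirroring
# pgoutput_row_filter_exec_expr()'s isnull check.

def row_filter_result(value):
    """Collapse a three-valued (True/False/None) filter result to a bool."""
    if value is None:   # SQL NULL => do not replicate
        return False
    return bool(value)

assert row_filter_result(True) is True
assert row_filter_result(False) is False
assert row_filter_result(None) is False
```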
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +808,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +832,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +839,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * Schema should be sent before the logic that replaces the
+ * relation because it also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +872,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +906,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +975,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1113,9 +1294,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1138,6 +1320,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1149,6 +1333,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1162,6 +1347,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1171,6 +1372,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1230,9 +1434,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1350,6 +1578,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1359,6 +1588,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1376,7 +1607,15 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ {
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
+ }
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 6ec524f..076cf9b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4141,6 +4141,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4151,9 +4152,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4162,6 +4170,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4202,6 +4211,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4234,8 +4247,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 29af845..f932a70 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -629,6 +629,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index ea4ca5c..42c43c3 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6328,8 +6328,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6358,6 +6365,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 82f2536..e5df91e 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,7 +85,9 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
+ Oid relid;
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -116,7 +118,7 @@ extern List *GetPubPartitionOptionRelations(List *result,
Oid relid);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3138877..1dfdaf3 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3640,6 +3640,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
typedef struct CreatePublicationStmt
@@ -3647,7 +3648,7 @@ typedef struct CreatePublicationStmt
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3660,7 +3661,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 82bce9b..a75ab3c 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -167,6 +167,77 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index e5745d5..5b22d51 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -100,6 +100,38 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
new file mode 100644
index 0000000..6428f0d
--- /dev/null
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -0,0 +1,300 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions should succeed. In this case, rows are replicated only if the
+# c value is divisible by both 2 AND 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
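As a standalone sketch (not part of the patches), the multi-publication semantics exercised by the TAP test above — when several publications in one subscription publish the same table, a row is replicated only if it satisfies the row filter of every publication — can be emulated like this. The predicate functions below are illustrative stand-ins for the WHERE clauses of tap_pub_1 and tap_pub_2:

```python
# Emulate how pgoutput combines row filters when a table is published
# by multiple publications pulled by the same subscription: every
# filter must accept the row (logical AND across publications).

def row_is_published(row, filters):
    """Return True only if all publications' filters accept the row."""
    return all(f(row) for f in filters)

# Stand-ins for the test's WHERE clauses:
# tap_pub_1: (c % 2 = 0), tap_pub_2: (c % 3 = 0)
filters = [
    lambda row: row["c"] % 2 == 0,
    lambda row: row["c"] % 3 == 0,
]

# Mirrors "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)"
replicated = [c for c in range(1, 21) if row_is_published({"c": c}, filters)]
```

With both filters applied, only the multiples of 6 survive, matching the `3|6|18` (count, min, max) result the test expects for tab_rowfilter_2.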
v33-0005-PS-POC-Row-filter-validation-walker.patch (application/octet-stream)
From 7c09af480e3a5d2ba65d6ea19740a209a8637143 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Tue, 19 Oct 2021 07:14:00 -0400
Subject: [PATCH v33] PS - POC Row filter validation walker.
This patch implements a parse-tree "walker" to validate a row-filter expression.
Only very simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no functions.
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr
This POC patch also disables (#if 0) all other row-filter errors which were thrown for
EXPR_KIND_PUBLICATION_WHERE in the 0001 patch.
Some regression tests are updated due to the modified validation error messages.
---
src/backend/catalog/dependency.c | 68 +++++++++++++++++++++++++++++++
src/backend/catalog/pg_publication.c | 14 +++++--
src/backend/parser/parse_agg.c | 5 ++-
src/backend/parser/parse_expr.c | 6 ++-
src/backend/parser/parse_func.c | 3 ++
src/backend/parser/parse_oper.c | 2 +
src/include/catalog/dependency.h | 2 +-
src/test/regress/expected/publication.out | 17 +++++---
src/test/regress/sql/publication.sql | 2 +
9 files changed, 107 insertions(+), 12 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index e81f093..405b3cd 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -132,6 +132,12 @@ typedef struct
int subflags; /* flags to pass down when recursing to obj */
} ObjectAddressAndFlags;
+/* for rowfilter_walker */
+typedef struct
+{
+ char *relname;
+} rf_context;
+
/* for find_expr_references_walker */
typedef struct
{
@@ -1554,6 +1560,68 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Walker checks that the row filter expression is legal. Allow only simple
+ * or compound expressions like:
+ *
+ * "(Var Op Const)" or
+ * "(Var Op Const) Bool (Var Op Const)"
+ *
+ * Nothing more complicated is permitted. Specifically, no functions of any kind
+ * and no user-defined operators.
+ */
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var) || IsA(node, Const) || IsA(node, BoolExpr))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ forbidden = _("function calls are not allowed");
+ }
+ else
+ {
+ elog(DEBUG1, "row filter contained something unexpected: %s", nodeToString(node));
+ forbidden = _("too complex");
+ }
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ context->relname),
+ errdetail("%s", forbidden),
+ errhint("only simple expressions using columns and constants are allowed")
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
+/*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if it encounters anything not permitted or unexpected.
+ */
+void
+rowfilter_validator(char *relname, Node *expr)
+{
+ rf_context context = {0};
+
+ context.relname = relname;
+ rowfilter_walker(expr, &context);
+}
+
+/*
* Find all the columns referenced by the row-filter expression and return what
* is found as a list of RfCol. This list is used for row-filter validation.
*/
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index eac7449..e49b3ca 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -144,7 +144,7 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
* Walk the parse-tree to decide if the row-filter is valid or not.
*/
static void
-rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node *rfnode, Relation rel)
{
Oid relid = RelationGetRelid(rel);
char *relname = RelationGetRelationName(rel);
@@ -152,6 +152,14 @@ rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
/*
* Rule:
*
+ * Walk the parse-tree and reject anything more complicated than a very
+ * simple expression.
+ */
+ rowfilter_validator(relname, rfnode);
+
+ /*
+ * Rule:
+ *
* If the publish operation contains "delete" then only columns that
* are allowed by the REPLICA IDENTITY rules are permitted to be used in
* the row-filter WHERE clause.
@@ -305,13 +313,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
whereclause = transformWhereClause(pstate,
copyObject(pri->whereClause),
EXPR_KIND_PUBLICATION_WHERE,
- "PUBLICATION");
+ "PUBLICATION WHERE");
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
/* Validate the row-filter. */
- rowfilter_expr_checker(pub, whereclause, targetrel);
+ rowfilter_expr_checker(pub, pstate, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..5e0c391 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,12 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
if (isAgg)
err = _("aggregate functions are not allowed in publication WHERE expressions");
else
err = _("grouping operations are not allowed in publication WHERE expressions");
-
+#endif
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +952,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("window functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..3519e62 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -201,6 +201,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
case T_FuncCall:
{
+#if 0
/*
* Forbid functions in publication WHERE condition
*/
@@ -209,6 +210,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("functions are not allowed in publication WHERE expressions"),
parser_errposition(pstate, exprLocation(expr))));
+#endif
result = transformFuncCall(pstate, (FuncCall *) expr);
break;
@@ -1777,7 +1779,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("cannot use subquery in publication WHERE expression");
+#endif
break;
/*
@@ -3100,7 +3104,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index e946f17..de9600f 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,10 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+ pstate->p_hasTargetSRFs = true;
+#if 0
err = _("set-returning functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..b3588df 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,12 +718,14 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+#if 0
/* Check it's not a custom operator for publication WHERE expressions */
if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("user-defined operators are not allowed in publication WHERE expressions"),
parser_errposition(pstate, location)));
+#endif
/* Do typecasting and build the expression tree */
if (ltree == NULL)
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 2c7310e..dd69aff 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -16,7 +16,6 @@
#include "catalog/objectaddress.h"
-
/*
* Precise semantics of a dependency relationship are specified by the
* DependencyType code (which is stored in a "char" field in pg_depend,
@@ -156,6 +155,7 @@ typedef struct RfCol {
int attnum;
} RfCol;
extern List *rowfilter_find_cols(Node *expr, Oid relId);
+extern void rowfilter_validator(char *relname, Node *expr);
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index e7c8c19..e10adc8 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -218,16 +218,21 @@ Tables:
-- fail - functions disallowed
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl4"
+DETAIL: function calls are not allowed
+HINT: only simple expressions using columns and constants are allowed
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+HINT: only simple expressions using columns and constants are allowed
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: too complex
+HINT: only simple expressions using columns and constants are allowed
-- fail - WHERE not allowed in DROP
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: syntax error at or near "WHERE"
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 6701d50..a30657b 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -123,6 +123,8 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
-- fail - WHERE not allowed in DROP
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
--
1.8.3.1
v33-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patchapplication/octet-stream; name=v33-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patchDownload
From 4f524ae704591fed3fb5e2903ccd157b0dc5c54a Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Tue, 19 Oct 2021 06:58:10 -0400
Subject: [PATCH v33] PS - Add tab auto-complete support for the Row Filter
WHERE.
The following auto-completions are added:
Complete "CREATE PUBLICATION <name> FOR TABLE <name>" with "WHERE (".
Complete "ALTER PUBLICATION <name> ADD TABLE <name>" with "WHERE (".
Complete "ALTER PUBLICATION <name> SET TABLE <name>" with "WHERE (".
---
src/bin/psql/tab-complete.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index ecae9df..bd35d19 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1648,6 +1648,11 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET", "TABLE", MatchAny)
+ || Matches("ALTER", "PUBLICATION", MatchAny, "ADD", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (");
/* ALTER PUBLICATION <name> SET ( */
else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("publish", "publish_via_partition_root");
@@ -2693,9 +2698,10 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("TABLE", "ALL TABLES");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL"))
COMPLETE_WITH("TABLES");
- else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES")
- || Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("WITH (");
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
--
1.8.3.1
v33-0003-PS-ExprState-cache-modifications.patchapplication/octet-stream; name=v33-0003-PS-ExprState-cache-modifications.patchDownload
From abe008d51a62d1db6595c5586f3fe10e62e8f7b8 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Tue, 19 Oct 2021 06:59:44 -0400
Subject: [PATCH v33] PS - ExprState cache modifications.
The cached row-filter state (e.g. the ExprState list) is now invalidated only
in the rel_sync_cache_relation_cb function, so an ALTER PUBLICATION for one
table no longer causes the row filters of other tables to be invalidated as
well.
Also, all code related to caching row filters has been removed from the
get_rel_sync_entry function; it is now done just before the filters are needed,
in the pgoutput_row_filter function.
Changes are based on suggestions from Amit [1] [2].
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
---
src/backend/replication/pgoutput/pgoutput.c | 200 +++++++++++++++++++---------
1 file changed, 136 insertions(+), 64 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 1220203..ce5e1c5 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -123,7 +123,15 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
- List *exprstate; /* ExprState for row filter */
+
+ /*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' only means that exprstate_list is correct -
+ * It doesn't mean that there actually is any row filter present for the
+ * current relid.
+ */
+ bool rowfilter_valid;
+ List *exprstate_list; /* ExprState for row filter(s) */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
@@ -161,7 +169,7 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static EState *create_estate_for_relation(Relation rel);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
/*
@@ -731,20 +739,121 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
* If it returns true, the change is replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
{
EState *estate;
ExprContext *ecxt;
ListCell *lc;
bool result = true;
+ Oid relid = RelationGetRelid(relation);
+
+ /*
+ * If the row filter caching is currently flagged "invalid" then it means we
+ * don't know yet whether there are any row filters for this relation.
+ *
+ * This code is usually one-time execution.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ bool am_partition = get_rel_relispartition(relid);
+ MemoryContext oldctx;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains. Release the tuple table slot if it
+ * already exists.
+ */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState(s) and cache them in the
+ * entry->exprstate_list.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+ Oid pub_relid = relid;
+
+ if (pub->pubviaroot && am_partition)
+ {
+ if (pub->alltables)
+ pub_relid = llast_oid(get_partition_ancestors(relid));
+ else
+ {
+ List *ancestors = get_partition_ancestors(relid);
+ ListCell *lc2;
+
+ /*
+ * Find the "topmost" ancestor that is in this
+ * publication.
+ */
+ foreach(lc2, ancestors)
+ {
+ Oid ancestor = lfirst_oid(lc2);
+
+ if (list_member_oid(GetRelationPublications(ancestor),
+ pub->oid))
+ {
+ pub_relid = ancestor;
+ }
+ }
+ }
+ }
+
+ /*
+ * Lookup if there is a row-filter, and if so build the ExprState for it.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(pub_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate_list = lappend(entry->exprstate_list, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+
+ } /* loop all subscribed publications */
+
+ entry->rowfilter_valid = true;
+ }
/* Bail out if there is no row filter */
- if (entry->exprstate == NIL)
+ if (entry->exprstate_list == NIL)
return true;
elog(DEBUG3, "table \"%s.%s\" has row filter",
- get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
- get_rel_name(relation->rd_id));
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
PushActiveSnapshot(GetTransactionSnapshot());
@@ -761,7 +870,7 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
* different row filter in these publications, all row filters must be
* matched in order to replicate this change.
*/
- foreach(lc, entry->exprstate)
+ foreach(lc, entry->exprstate_list)
{
ExprState *exprstate = (ExprState *) lfirst(lc);
@@ -840,7 +949,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
break;
/*
@@ -873,7 +982,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -907,7 +1016,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1318,10 +1427,11 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
- entry->exprstate = NIL;
+ entry->exprstate_list = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1333,7 +1443,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
- TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1347,22 +1456,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
publications_valid = true;
}
- /* Release tuple table slot */
- if (entry->scantuple != NULL)
- {
- ExecDropSingleTupleTableSlot(entry->scantuple);
- entry->scantuple = NULL;
- }
-
- /*
- * Create a tuple table slot for row filter. TupleDesc must live as
- * long as the cache remains.
- */
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- tupdesc = CreateTupleDescCopy(tupdesc);
- entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
- MemoryContextSwitchTo(oldctx);
-
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1372,9 +1465,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
{
Publication *pub = lfirst(lc);
bool publish = false;
- HeapTuple rftuple;
- Datum rfdatum;
- bool rfisnull;
if (pub->alltables)
{
@@ -1434,33 +1524,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- /*
- * Cache row filter, if available. All publication-table mappings
- * must be checked. If it is a partition and pubviaroot is true,
- * use the row filter of the topmost partitioned table instead of
- * the row filter of its own partition.
- */
- rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
- if (HeapTupleIsValid(rftuple))
- {
- rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
-
- if (!rfisnull)
- {
- Node *rfnode;
- ExprState *exprstate;
-
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = stringToNode(TextDatumGetCString(rfdatum));
-
- /* Prepare for expression execution */
- exprstate = pgoutput_row_filter_init_expr(rfnode);
- entry->exprstate = lappend(entry->exprstate, exprstate);
- MemoryContextSwitchTo(oldctx);
- }
-
- ReleaseSysCache(rftuple);
- }
}
list_free(pubids);
@@ -1567,6 +1630,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanups. (Will be rebuilt later if needed).
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstate_list != NIL)
+ {
+ list_free_deep(entry->exprstate_list);
+ entry->exprstate_list = NIL;
+ }
}
}
@@ -1607,12 +1685,6 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
-
- if (entry->exprstate != NIL)
- {
- list_free_deep(entry->exprstate);
- entry->exprstate = NIL;
- }
}
MemoryContextSwitchTo(oldctx);
--
1.8.3.1
v33-0004-PS-Row-filter-validation-of-replica-identity.patchapplication/octet-stream; name=v33-0004-PS-Row-filter-validation-of-replica-identity.patchDownload
From b3e6f3e722b83bb4d652fcc2fd69960fe18861ba Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Tue, 19 Oct 2021 07:10:19 -0400
Subject: [PATCH v33] PS - Row filter validation of replica identity.
This patch introduces some additional row filter validation. Currently it is
implemented only for the publish mode "delete": it validates that every column
referenced in the filter expression is part of the REPLICA IDENTITY or primary
key.
Also updated test code.
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
---
src/backend/catalog/dependency.c | 58 ++++++++++++++++++
src/backend/catalog/pg_publication.c | 74 ++++++++++++++++++++++-
src/include/catalog/dependency.h | 6 ++
src/test/regress/expected/publication.out | 97 +++++++++++++++++++++++++++++--
src/test/regress/sql/publication.sql | 79 ++++++++++++++++++++++++-
src/test/subscription/t/025_row_filter.pl | 7 +--
6 files changed, 309 insertions(+), 12 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index 91c3e97..e81f093 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -1554,6 +1554,64 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Find all the columns referenced by the row-filter expression and return what
+ * is found as a list of RfCol. This list is used for row-filter validation.
+ */
+List *
+rowfilter_find_cols(Node *expr, Oid relId)
+{
+ find_expr_references_context context;
+ RangeTblEntry rte;
+ int ref;
+ List *rfcol_list = NIL;
+
+ context.addrs = new_object_addresses();
+
+ /* We gin up a rather bogus rangetable list to handle Vars */
+ MemSet(&rte, 0, sizeof(rte));
+ rte.type = T_RangeTblEntry;
+ rte.rtekind = RTE_RELATION;
+ rte.relid = relId;
+ rte.relkind = RELKIND_RELATION; /* no need for exactness here */
+ rte.rellockmode = AccessShareLock;
+
+ context.rtables = list_make1(list_make1(&rte));
+
+ /* Scan the expression tree for referenceable objects */
+ find_expr_references_walker(expr, &context);
+
+ /* Remove any duplicates */
+ eliminate_duplicate_dependencies(context.addrs);
+
+ /* Build/Return the list of columns referenced by this Row Filter */
+ for (ref = 0; ref < context.addrs->numrefs; ref++)
+ {
+ ObjectAddress *thisobj = context.addrs->refs + ref;
+
+ if (thisobj->classId == RelationRelationId)
+ {
+ RfCol *rfcol;
+
+ /*
+ * The parser already took care of ensuring columns must be from
+ * the correct table.
+ */
+ Assert(thisobj->objectId == relId);
+
+ rfcol = palloc(sizeof(RfCol));
+ rfcol->name = get_attname(thisobj->objectId, thisobj->objectSubId, false);
+ rfcol->attnum = thisobj->objectSubId;
+
+ rfcol_list = lappend(rfcol_list, rfcol);
+ }
+ }
+
+ free_object_addresses(context.addrs);
+
+ return rfcol_list;
+}
+
+/*
* recordDependencyOnExpr - find expression dependencies
*
* This is used to find the dependencies of rules, constraint expressions,
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 1d0f77d..eac7449 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -141,9 +141,76 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
}
/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
+ * Walk the parse-tree to decide if the row-filter is valid or not.
*/
+static void
+rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+{
+ Oid relid = RelationGetRelid(rel);
+ char *relname = RelationGetRelationName(rel);
+
+ /*
+ * Rule:
+ *
+ * If the publish operation contains "delete" then only columns that
+ * are allowed by the REPLICA IDENTITY rules are permitted to be used in
+ * the row-filter WHERE clause.
+ *
+ * TODO - check later for publish "update" case.
+ */
+ if (pub->pubactions.pubdelete)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ List *rfcols;
+ ListCell *lc;
+ Bitmapset *bms_okcols;
+
+ /*
+ * Find what are the cols that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+
+ /*
+ * Find what cols are referenced in the row filter WHERE clause,
+ * and validate that each of those referenced cols is allowed.
+ */
+ rfcols = rowfilter_find_cols(rfnode, relid);
+ foreach(lc, rfcols)
+ {
+ RfCol *rfcol = lfirst(lc);
+ char *colname = rfcol->name;
+ int attnum = rfcol->attnum;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, bms_okcols))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ relname),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY", colname)));
+ }
+ }
+
+ bms_free(bms_okcols);
+ list_free_deep(rfcols);
+ }
+ }
+}
+
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -242,6 +309,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 2885f35..2c7310e 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -151,6 +151,12 @@ extern void performDeletion(const ObjectAddress *object,
extern void performMultipleDeletions(const ObjectAddresses *objects,
DropBehavior behavior, int flags);
+typedef struct RfCol {
+ char *name;
+ int attnum;
+} RfCol;
+extern List *rowfilter_find_cols(Node *expr, Oid relId);
+
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
DependencyType behavior);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index a75ab3c..e7c8c19 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -172,13 +172,15 @@ CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -188,7 +190,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -199,7 +201,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -210,7 +212,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
@@ -238,6 +240,91 @@ DROP TABLE testpub_rf_tbl4;
DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 5b22d51..6701d50 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -105,7 +105,9 @@ CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -132,6 +134,81 @@ DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index 6428f0d..dc9becc 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -223,9 +225,7 @@ $node_publisher->wait_for_catchup($appname);
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -234,7 +234,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
--
1.8.3.1
Attachment: v33-0006-Support-updates-based-on-old-and-new-tuple-in-ro.patch (application/octet-stream)
From ce10151674e3c53fdf845d5db0a91c582b4d7e70 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Tue, 19 Oct 2021 22:33:26 -0400
Subject: [PATCH v33] Support updates based on old and new tuple in row
filters.
When applying a row filter to updates, check both the old tuple and
the new tuple to decide how the update needs to be transformed.
UPDATE
old-row (match) new-row (no match) -> DELETE
old-row (no match) new row (match) -> INSERT
old-row (match) new row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Tuples that have been deformed are also cached in slots, to avoid
deforming the same tuple multiple times.
---
src/backend/replication/logical/proto.c | 122 +++++++++++++
src/backend/replication/pgoutput/pgoutput.c | 264 +++++++++++++++++++++++++---
src/include/replication/logicalproto.h | 4 +
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/025_row_filter.pl | 4 +-
5 files changed, 375 insertions(+), 25 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..6b14340 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -32,6 +33,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
HeapTuple tuple, bool binary);
+static void logicalrep_write_tuple_cached(StringInfo out, Relation rel,
+ TupleTableSlot *slot, bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -438,6 +441,38 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
}
/*
+ * Write UPDATE to the output stream using cached virtual slots.
+ * Cached updates will have both old tuple and new tuple.
+ */
+void
+logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple, bool binary)
+{
+ pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
+
+ Assert(rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT ||
+ rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
+ rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX);
+
+ /* transaction ID (if not valid, we're not streaming) */
+ if (TransactionIdIsValid(xid))
+ pq_sendint32(out, xid);
+
+ /* use Oid as relation identifier */
+ pq_sendint32(out, RelationGetRelid(rel));
+
+ if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ pq_sendbyte(out, 'O'); /* old tuple follows */
+ else
+ pq_sendbyte(out, 'K'); /* old key follows */
+ logicalrep_write_tuple_cached(out, rel, oldtuple, binary);
+
+ pq_sendbyte(out, 'N'); /* new tuple follows */
+ logicalrep_write_tuple_cached(out, rel, newtuple, binary);
+}
+
+
+/*
* Write UPDATE to the output stream.
*/
void
@@ -746,6 +781,93 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
}
/*
+ * Write a tuple to the output stream using a cached slot, in the most efficient format possible.
+ */
+static void
+logicalrep_write_tuple_cached(StringInfo out, Relation rel, TupleTableSlot *slot, bool binary)
+{
+ TupleDesc desc;
+ int i;
+ uint16 nliveatts = 0;
+ HeapTuple tuple = ExecFetchSlotHeapTuple(slot, false, NULL);
+
+ desc = RelationGetDescr(rel);
+
+ for (i = 0; i < desc->natts; i++)
+ {
+ if (TupleDescAttr(desc, i)->attisdropped || TupleDescAttr(desc, i)->attgenerated)
+ continue;
+ nliveatts++;
+ }
+ pq_sendint16(out, nliveatts);
+
+ /* try to allocate enough memory from the get-go */
+ enlargeStringInfo(out, tuple->t_len +
+ nliveatts * (1 + 4));
+
+ /* Write the values */
+ for (i = 0; i < desc->natts; i++)
+ {
+ HeapTuple typtup;
+ Form_pg_type typclass;
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ if (att->attisdropped || att->attgenerated)
+ continue;
+
+ if (slot->tts_isnull[i])
+ {
+ pq_sendbyte(out, LOGICALREP_COLUMN_NULL);
+ continue;
+ }
+
+ if (att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(slot->tts_values[i]))
+ {
+ /*
+ * Unchanged toasted datum. (Note that we don't promise to detect
+ * unchanged data in general; this is just a cheap check to avoid
+ * sending large values unnecessarily.)
+ */
+ pq_sendbyte(out, LOGICALREP_COLUMN_UNCHANGED);
+ continue;
+ }
+
+ typtup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(att->atttypid));
+ if (!HeapTupleIsValid(typtup))
+ elog(ERROR, "cache lookup failed for type %u", att->atttypid);
+ typclass = (Form_pg_type) GETSTRUCT(typtup);
+
+ /*
+ * Send in binary if requested and type has suitable send function.
+ */
+ if (binary && OidIsValid(typclass->typsend))
+ {
+ bytea *outputbytes;
+ int len;
+
+ pq_sendbyte(out, LOGICALREP_COLUMN_BINARY);
+ outputbytes = OidSendFunctionCall(typclass->typsend, slot->tts_values[i]);
+ len = VARSIZE(outputbytes) - VARHDRSZ;
+ pq_sendint(out, len, 4); /* length */
+ pq_sendbytes(out, VARDATA(outputbytes), len); /* data */
+ pfree(outputbytes);
+ }
+ else
+ {
+ char *outputstr;
+
+ pq_sendbyte(out, LOGICALREP_COLUMN_TEXT);
+ outputstr = OidOutputFunctionCall(typclass->typoutput, slot->tts_values[i]);
+ pq_sendcountedtext(out, outputstr, strlen(outputstr), false);
+ pfree(outputstr);
+ }
+
+ ReleaseSysCache(typtup);
+ }
+}
+
+
+/*
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index ce5e1c5..5ecfd63 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -132,7 +132,10 @@ typedef struct RelationSyncEntry
*/
bool rowfilter_valid;
List *exprstate_list; /* ExprState for row filter(s) */
- TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
+ TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
+ TupleTableSlot *tmp_new_tuple; /* slot for temporary new tuple used for expression evaluation */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -167,10 +170,16 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot,
+ RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry,
+ ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -734,18 +743,103 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * The update is checked against the row filter, if any.
+ * Updates are transformed into inserts or deletes based on the
+ * old_tuple and new_tuple. The resulting action is returned via the
+ * action parameter; if it is unchanged, the action remains UPDATE.
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (no match) new row (match) -> INSERT
+ * old-row (match) new row (match) -> UPDATE
+ * old-row (no match) new-row (no match) -> (drop change)
+ * If it returns true, the change is to be replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter_update(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
{
- EState *estate;
- ExprContext *ecxt;
+ TupleDesc desc = entry->scantuple->tts_tupleDescriptor;
+ int i;
+ bool old_matched, new_matched;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate_list == NIL)
+ return true;
+
+ /* updates require a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /*
+ * If there is no old_tuple, then none of the replica identity columns
+ * changed, and this reduces to a simple update.
+ */
+ if (!oldtuple)
+ return pgoutput_row_filter(relation, NULL, newtuple, entry);
+
+
+ old_slot = entry->old_tuple;
+ new_slot = entry->new_tuple;
+ tmp_new_slot = entry->tmp_new_tuple;
+ ExecClearTuple(old_slot);
+ ExecClearTuple(new_slot);
+ ExecClearTuple(tmp_new_slot);
+
+ /*
+ * Unchanged toasted replica identity columns are only detoasted
+ * in the old tuple; copy these over to the new tuple.
+ */
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, new_slot);
+
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ }
+
+ }
+
+ old_matched = pgoutput_row_filter_virtual(relation, old_slot, entry);
+ new_matched = pgoutput_row_filter_virtual(relation, tmp_new_slot, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && !old_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter, the first time it is needed.
+ */
+
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
+{
+ Oid relid = RelationGetRelid(relation);
ListCell *lc;
- bool result = true;
- Oid relid = RelationGetRelid(relation);
/*
* If the row filter caching is currently flagged "invalid" then it means we
@@ -760,7 +854,7 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
TupleDesc tupdesc = RelationGetDescr(relation);
/*
- * Create a tuple table slot for row filter. TupleDesc must live as
+ * Create tuple table slots for row filter. TupleDesc must live as
* long as the cache remains. Release the tuple table slot if it
* already exists.
*/
@@ -769,9 +863,31 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
+ if (entry->old_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->old_tuple);
+ entry->old_tuple = NULL;
+ }
+
+ if (entry->new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->new_tuple);
+ entry->new_tuple = NULL;
+ }
+
+ if (entry->tmp_new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->tmp_new_tuple);
+ entry->tmp_new_tuple = NULL;
+ }
+
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+
MemoryContextSwitchTo(oldctx);
/*
@@ -846,6 +962,76 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
entry->rowfilter_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ ListCell *lc;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate_list == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = slot;
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, entry->exprstate_list)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ ListCell *lc;
/* Bail out if there is no row filter */
if (entry->exprstate_list == NIL)
@@ -941,6 +1127,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -949,7 +1138,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
break;
/*
@@ -980,9 +1169,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
- /* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter_update(relation, oldtuple, newtuple, relentry,
+ &modified_action))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1005,8 +1195,29 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ if (relentry->new_tuple != NULL && !TTS_EMPTY(relentry->new_tuple))
+ logicalrep_write_update_cached(ctx->out, xid, relation,
+ relentry->old_tuple, relentry->new_tuple, data->binary);
+ else
+ logicalrep_write_update(ctx->out, xid, relation, oldtuple,
+ newtuple, data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1016,7 +1227,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1431,6 +1642,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
+ entry->new_tuple = NULL;
+ entry->old_tuple = NULL;
+ entry->tmp_new_tuple = NULL;
entry->exprstate_list = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
@@ -1635,10 +1849,20 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
* Row filter cache cleanups. (Will be rebuilt later if needed).
*/
entry->rowfilter_valid = false;
- if (entry->scantuple != NULL)
+ if (entry->new_tuple != NULL)
{
- ExecDropSingleTupleTableSlot(entry->scantuple);
- entry->scantuple = NULL;
+ ExecDropSingleTupleTableSlot(entry->new_tuple);
+ entry->new_tuple = NULL;
+ }
+ if (entry->old_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->old_tuple);
+ entry->old_tuple = NULL;
+ }
+ if (entry->tmp_new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->tmp_new_tuple);
+ entry->tmp_new_tuple = NULL;
}
if (entry->exprstate_list != NIL)
{
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..ba71f3f 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -212,6 +213,9 @@ extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
HeapTuple newtuple, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index dc9becc..742bbbe 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -220,7 +220,8 @@ $node_publisher->wait_for_catchup($appname);
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+# but this row is removed by the UPDATE below.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -232,7 +233,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
--
1.8.3.1
On Thu, Sep 23, 2021 at 10:33 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
11) extra (unnecessary) parens in the deparsed expression
test=# alter publication p add table t where ((b < 100) and (c < 100));
ALTER PUBLICATION
test=# \dRp+ p
Publication p
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
-------+------------+---------+---------+---------+-----------+----------
user | f | t | t | t | t | f
Tables:
"public.t" WHERE (((b < 100) AND (c < 100)))
I also reported the same as this some months back, but at that time it
was rejected citing some pg_dump patch. (Please see [1] #14).
------
[1]: /messages/by-id/532a18d8-ce90-4444-8570-8a9fcf09f329@www.fastmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
PSA new set of patches:
v34-0001 = the "main" patch from Euler. No change
v34-0002 = tab auto-complete. No change
v34-0003 = cache updates. Addresses Tomas review comment #3 [1].
v34-0004 = filter validation replica identity. Addresses Tomas review
comments #8 and #9 [1].
v34-0005 = filter validation walker. Addresses Tomas review comment #6 [1].
v34-0006 = support old/new tuple logic for row-filters. Modified, but
no functional change.
------
[1]: /messages/by-id/574b4e78-2f35-acf3-4bdc-4b872582e739@enterprisedb.com
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v34-0005-PS-Row-filter-validation-walker.patchapplication/octet-stream; name=v34-0005-PS-Row-filter-validation-walker.patchDownload
From e86c507c0cc40a3a6003a0926e500f30b300330f Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Tue, 26 Oct 2021 13:44:13 +1100
Subject: [PATCH v34] PS - Row filter validation walker.
This patch implements a parse-tree "walker" to validate a row-filter expression.
Only very simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no user-defined functions.
- no system functions (unless they are IMMUTABLE). See design decision at [1].
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr
[1] https://www.postgresql.org/message-id/CAA4eK1%2BXoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ%40mail.gmail.com
This patch also disables (#if 0) all other row-filter errors which were thrown for
EXPR_KIND_PUBLICATION_WHERE in the 0001 patch.
Some regression tests are updated due to modified validation messages and rules.
---
src/backend/catalog/dependency.c | 93 +++++++++++++++++++++++++++++++
src/backend/catalog/pg_publication.c | 20 ++++---
src/backend/parser/parse_agg.c | 5 +-
src/backend/parser/parse_expr.c | 6 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 2 +
src/include/catalog/dependency.h | 2 +-
src/test/regress/expected/publication.out | 26 ++++++---
src/test/regress/sql/publication.sql | 12 +++-
9 files changed, 149 insertions(+), 20 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index 63bd378..d294a50 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -132,6 +132,12 @@ typedef struct
int subflags; /* flags to pass down when recursing to obj */
} ObjectAddressAndFlags;
+/* for rowfilter_walker */
+typedef struct
+{
+ char *relname;
+} rf_context;
+
/* for find_expr_references_walker */
typedef struct
{
@@ -1554,6 +1560,93 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Walker checks that the row filter expression is legal. Allow only simple
+ * or compound expressions like:
+ *
+ * "(Var Op Const)" or
+ * "(Var Op Const) Bool (Var Op Const)"
+ *
+ * User-defined operators are not allowed.
+ * User-defined functions are not allowed.
+ * System functions that are not IMMUTABLE are not allowed.
+ */
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+ bool too_complex = false;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var) || IsA(node, Const) || IsA(node, BoolExpr))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ Oid funcid = ((FuncExpr *)node)->funcid;
+ char *funcname = get_func_name(funcid);
+
+ /*
+ * User-defined functions are not allowed.
+ * System-functions that are not IMMUTABLE are not allowed.
+ */
+ if (funcid >= FirstNormalObjectId)
+ {
+ forbidden = psprintf("user-defined functions are not allowed: %s",
+ funcname);
+ }
+ else
+ {
+ if (func_volatile(funcid) != PROVOLATILE_IMMUTABLE)
+ forbidden = psprintf("system functions that are not IMMUTABLE are not allowed: %s",
+ funcname);
+ }
+ }
+ else
+ {
+ elog(DEBUG1, "the row filter contained something unexpected: %s", nodeToString(node));
+ too_complex = true;
+ }
+
+ if (too_complex)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ context->relname),
+ errhint("only simple expressions using columns, constants and immutable system functions are allowed")
+ ));
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ context->relname),
+ errdetail("%s", forbidden)
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
+/*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if it encounters anything not permitted or unexpected.
+ */
+void
+rowfilter_validator(char *relname, Node *expr)
+{
+ rf_context context = {0};
+
+ context.relname = relname;
+ rowfilter_walker(expr, &context);
+}
+
+/*
* Find all the columns referenced by the row-filter expression and return them
* as a list of attribute numbers. This list is used for row-filter validation.
*/
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 064229f..dc2f459 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -143,20 +143,26 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
/*
* Decide if the row-filter is valid according to the following rules:
*
- * Rule 1. If the publish operation contains "delete" then only columns that
+ * Rule 1. Walk the parse-tree and reject anything other than very simple
+ * expressions. (See rowfilter_validator for details of what is permitted.)
+ *
+ * Rule 2. If the publish operation contains "delete" then only columns that
* are allowed by the REPLICA IDENTITY rules are permitted to be used in the
* row-filter WHERE clause.
- *
- * Rule 2. TODO
*/
static void
-rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node *rfnode, Relation rel)
{
Oid relid = RelationGetRelid(rel);
char *relname = RelationGetRelationName(rel);
/*
- * Rule 1: For "delete", check that filter cols are also valid replica
+ * Rule 1. Walk the parse-tree and reject anything unexpected.
+ */
+ rowfilter_validator(relname, rfnode);
+
+ /*
+ * Rule 2: For "delete", check that filter cols are also valid replica
* identity cols.
*
* TODO - check later for publish "update" case.
@@ -308,13 +314,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
whereclause = transformWhereClause(pstate,
copyObject(pri->whereClause),
EXPR_KIND_PUBLICATION_WHERE,
- "PUBLICATION");
+ "PUBLICATION WHERE");
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
/* Validate the row-filter. */
- rowfilter_expr_checker(pub, whereclause, targetrel);
+ rowfilter_expr_checker(pub, pstate, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..5e0c391 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,12 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
if (isAgg)
err = _("aggregate functions are not allowed in publication WHERE expressions");
else
err = _("grouping operations are not allowed in publication WHERE expressions");
-
+#endif
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +952,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("window functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..3519e62 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -201,6 +201,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
case T_FuncCall:
{
+#if 0
/*
* Forbid functions in publication WHERE condition
*/
@@ -209,6 +210,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("functions are not allowed in publication WHERE expressions"),
parser_errposition(pstate, exprLocation(expr))));
+#endif
result = transformFuncCall(pstate, (FuncCall *) expr);
break;
@@ -1777,7 +1779,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("cannot use subquery in publication WHERE expression");
+#endif
break;
/*
@@ -3100,7 +3104,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index e946f17..de9600f 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,10 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+ pstate->p_hasTargetSRFs = true;
+#if 0
err = _("set-returning functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..b3588df 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,12 +718,14 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+#if 0
/* Check it's not a custom operator for publication WHERE expressions */
if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("user-defined operators are not allowed in publication WHERE expressions"),
parser_errposition(pstate, location)));
+#endif
/* Do typecasting and build the expression tree */
if (ltree == NULL)
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 323f658..c3e33b9 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -16,7 +16,6 @@
#include "catalog/objectaddress.h"
-
/*
* Precise semantics of a dependency relationship are specified by the
* DependencyType code (which is stored in a "char" field in pg_depend,
@@ -152,6 +151,7 @@ extern void performMultipleDeletions(const ObjectAddresses *objects,
DropBehavior behavior, int flags);
extern List *rowfilter_find_cols(Node *expr, Oid relId);
+extern void rowfilter_validator(char *relname, Node *expr);
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index e7c8c19..4dca6aa 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -216,18 +216,27 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: user-defined functions are not allowed: testpub_rf_func99
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: system functions that are not IMMUTABLE are not allowed: random
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+HINT: only simple expressions using columns, constants and immutable system functions are allowed
-- fail - WHERE not allowed in DROP
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: syntax error at or near "WHERE"
@@ -240,6 +249,7 @@ DROP TABLE testpub_rf_tbl4;
DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
-- ======================================================
-- More row filter tests for validating column references
CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 6701d50..fd84013 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -117,12 +117,19 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
\dRp+ testpub5
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
-- fail - WHERE not allowed in DROP
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
@@ -133,6 +140,7 @@ DROP TABLE testpub_rf_tbl4;
DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
-- ======================================================
-- More row filter tests for validating column references
--
1.8.3.1
v34-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patchapplication/octet-stream; name=v34-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patchDownload
From 0bfc66d45ecc7b2508f8156169ed06c5b54f9504 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Mon, 25 Oct 2021 09:19:30 +1100
Subject: [PATCH v34] PS - Add tab auto-complete support for the Row Filter
WHERE.
The following auto-completions are added:
Complete "CREATE PUBLICATION <name> FOR TABLE <name>" with "WHERE (".
Complete "ALTER PUBLICATION <name> ADD TABLE <name>" with "WHERE (".
Complete "ALTER PUBLICATION <name> SET TABLE <name>" with "WHERE (".
---
src/bin/psql/tab-complete.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index ecae9df..bd35d19 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1648,6 +1648,11 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET", "TABLE", MatchAny)
+ || Matches("ALTER", "PUBLICATION", MatchAny, "ADD", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (");
/* ALTER PUBLICATION <name> SET ( */
else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("publish", "publish_via_partition_root");
@@ -2693,9 +2698,10 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("TABLE", "ALL TABLES");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL"))
COMPLETE_WITH("TABLES");
- else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES")
- || Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("WITH (");
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
--
1.8.3.1
v34-0003-PS-ExprState-cache-modifications.patchapplication/octet-stream; name=v34-0003-PS-ExprState-cache-modifications.patchDownload
From d553f364ccc48e44e73f2663a9ef72c866b85b1b Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Mon, 25 Oct 2021 10:16:41 +1100
Subject: [PATCH v34] PS - ExprState cache modifications.
Now the row-filter caches (e.g. the ExprState list) are invalidated only in
the rel_sync_cache_relation_cb function, which means that ALTER PUBLICATION
for one table no longer causes the row-filters of other tables to be
invalidated.
Also, all code related to caching row-filters has been removed from the
get_rel_sync_entry function; the caching is now done just before the filters
are needed, in the pgoutput_row_filter function.
Changes are based on suggestions from Amit [1] [2].
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
---
src/backend/replication/pgoutput/pgoutput.c | 200 +++++++++++++++++++---------
1 file changed, 136 insertions(+), 64 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 1220203..185e8d0 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -123,7 +123,15 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
- List *exprstate; /* ExprState for row filter */
+
+ /*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' only means the exprstates list is correct;
+ * it doesn't mean that there actually is any row filter present for the
+ * current relid.
+ */
+ bool rowfilter_valid;
+ List *exprstates; /* ExprState for row filter(s) */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
@@ -161,7 +169,7 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static EState *create_estate_for_relation(Relation rel);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
/*
@@ -731,20 +739,121 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
* If it returns true, the change is replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
{
EState *estate;
ExprContext *ecxt;
ListCell *lc;
bool result = true;
+ Oid relid = RelationGetRelid(relation);
+
+ /*
+ * If the row filter cache is currently flagged "invalid" then we don't
+ * yet know whether there are any row filters for this relation.
+ *
+ * This code is usually one-time execution.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ bool am_partition = get_rel_relispartition(relid);
+ MemoryContext oldctx;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains. Release the tuple table slot if it
+ * already exists.
+ */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState(s) and cache them in the
+ * entry->exprstates list.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+ Oid pub_relid = relid;
+
+ if (pub->pubviaroot && am_partition)
+ {
+ if (pub->alltables)
+ pub_relid = llast_oid(get_partition_ancestors(relid));
+ else
+ {
+ List *ancestors = get_partition_ancestors(relid);
+ ListCell *lc2;
+
+ /*
+ * Find the "topmost" ancestor that is in this
+ * publication.
+ */
+ foreach(lc2, ancestors)
+ {
+ Oid ancestor = lfirst_oid(lc2);
+
+ if (list_member_oid(GetRelationPublications(ancestor),
+ pub->oid))
+ {
+ pub_relid = ancestor;
+ }
+ }
+ }
+ }
+
+ /*
+ * Lookup if there is a row-filter, and if so build the ExprState for it.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(pub_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstates = lappend(entry->exprstates, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+
+ } /* loop all subscribed publications */
+
+ entry->rowfilter_valid = true;
+ }
/* Bail out if there is no row filter */
- if (entry->exprstate == NIL)
+ if (entry->exprstates == NIL)
return true;
elog(DEBUG3, "table \"%s.%s\" has row filter",
- get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
- get_rel_name(relation->rd_id));
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
PushActiveSnapshot(GetTransactionSnapshot());
@@ -761,7 +870,7 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
* different row filter in these publications, all row filters must be
* matched in order to replicate this change.
*/
- foreach(lc, entry->exprstate)
+ foreach(lc, entry->exprstates)
{
ExprState *exprstate = (ExprState *) lfirst(lc);
@@ -840,7 +949,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
break;
/*
@@ -873,7 +982,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -907,7 +1016,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1318,10 +1427,11 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
- entry->exprstate = NIL;
+ entry->exprstates = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1333,7 +1443,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
- TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1347,22 +1456,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
publications_valid = true;
}
- /* Release tuple table slot */
- if (entry->scantuple != NULL)
- {
- ExecDropSingleTupleTableSlot(entry->scantuple);
- entry->scantuple = NULL;
- }
-
- /*
- * Create a tuple table slot for row filter. TupleDesc must live as
- * long as the cache remains.
- */
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- tupdesc = CreateTupleDescCopy(tupdesc);
- entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
- MemoryContextSwitchTo(oldctx);
-
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1372,9 +1465,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
{
Publication *pub = lfirst(lc);
bool publish = false;
- HeapTuple rftuple;
- Datum rfdatum;
- bool rfisnull;
if (pub->alltables)
{
@@ -1434,33 +1524,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- /*
- * Cache row filter, if available. All publication-table mappings
- * must be checked. If it is a partition and pubviaroot is true,
- * use the row filter of the topmost partitioned table instead of
- * the row filter of its own partition.
- */
- rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
- if (HeapTupleIsValid(rftuple))
- {
- rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
-
- if (!rfisnull)
- {
- Node *rfnode;
- ExprState *exprstate;
-
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = stringToNode(TextDatumGetCString(rfdatum));
-
- /* Prepare for expression execution */
- exprstate = pgoutput_row_filter_init_expr(rfnode);
- entry->exprstate = lappend(entry->exprstate, exprstate);
- MemoryContextSwitchTo(oldctx);
- }
-
- ReleaseSysCache(rftuple);
- }
}
list_free(pubids);
@@ -1567,6 +1630,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanups. (Will be rebuilt later if needed).
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstates != NIL)
+ {
+ list_free_deep(entry->exprstates);
+ entry->exprstates = NIL;
+ }
}
}
@@ -1607,12 +1685,6 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
-
- if (entry->exprstate != NIL)
- {
- list_free_deep(entry->exprstate);
- entry->exprstate = NIL;
- }
}
MemoryContextSwitchTo(oldctx);
--
1.8.3.1
v34-0004-PS-Row-filter-validation-of-replica-identity.patchapplication/octet-stream; name=v34-0004-PS-Row-filter-validation-of-replica-identity.patchDownload
From 144dc34b1b870e8d43e31d6f947a3173ed6d335b Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Mon, 25 Oct 2021 15:34:03 +1100
Subject: [PATCH v34] PS - Row filter validation of replica identity.
This patch introduces additional row filter validation. It is currently
implemented only for the "delete" publish mode, and it validates that every
column referenced in the filter expression is part of the REPLICA IDENTITY or
primary key.
Also updated test code.
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
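The Rule 1 check described above can be sketched as follows. This is a
hypothetical Python simplification for illustration only, not the patch's C
implementation; the function name and the string codes for the replica
identity kinds are invented here.

```python
# Hypothetical sketch of the "delete" publish-mode validation: every column
# referenced by the row filter must be covered by the replica identity.

def filter_columns_disallowed(replica_identity, identity_cols, filter_cols):
    """Return the filter columns not covered by the replica identity.

    replica_identity: 'full', 'default', 'nothing', or 'index'
    identity_cols:    column names in the PK / identity index
    filter_cols:      column names referenced by the WHERE clause
    """
    if replica_identity == 'full':
        return set()                 # FULL covers every column
    if replica_identity == 'nothing':
        return set(filter_cols)      # nothing is covered; any reference fails
    # 'default' (primary key) or 'index' (identity index)
    return set(filter_cols) - set(identity_cols)
```

For example, with `rf_tbl_abcd_pk` (PRIMARY KEY on `a, b`) and REPLICA
IDENTITY DEFAULT, a filter on `c` would be rejected while a filter on `a`
would pass, matching the regression test cases below.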
---
src/backend/catalog/dependency.c | 55 ++++++++++++++++++
src/backend/catalog/pg_publication.c | 77 +++++++++++++++++++++++-
src/include/catalog/dependency.h | 2 +
src/test/regress/expected/publication.out | 97 +++++++++++++++++++++++++++++--
src/test/regress/sql/publication.sql | 79 ++++++++++++++++++++++++-
src/test/subscription/t/025_row_filter.pl | 7 +--
6 files changed, 305 insertions(+), 12 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index 91c3e97..63bd378 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -1554,6 +1554,61 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Find all the columns referenced by the row-filter expression and return them
+ * as a list of attribute numbers. This list is used for row-filter validation.
+ */
+List *
+rowfilter_find_cols(Node *expr, Oid relId)
+{
+ find_expr_references_context context;
+ RangeTblEntry rte;
+ int ref;
+ List *rfcols = NIL;
+
+ context.addrs = new_object_addresses();
+
+ /* We gin up a rather bogus rangetable list to handle Vars */
+ MemSet(&rte, 0, sizeof(rte));
+ rte.type = T_RangeTblEntry;
+ rte.rtekind = RTE_RELATION;
+ rte.relid = relId;
+ rte.relkind = RELKIND_RELATION; /* no need for exactness here */
+ rte.rellockmode = AccessShareLock;
+
+ context.rtables = list_make1(list_make1(&rte));
+
+ /* Scan the expression tree for referenceable objects */
+ find_expr_references_walker(expr, &context);
+
+ /* Remove any duplicates */
+ eliminate_duplicate_dependencies(context.addrs);
+
+ /* Build/Return the list of columns referenced by this Row Filter */
+ for (ref = 0; ref < context.addrs->numrefs; ref++)
+ {
+ ObjectAddress *thisobj = context.addrs->refs + ref;
+
+ if (thisobj->classId == RelationRelationId)
+ {
+ AttrNumber attnum;
+
+ /*
+ * The parser already took care of ensuring columns must be from
+ * the correct table.
+ */
+ Assert(thisobj->objectId == relId);
+
+ attnum = thisobj->objectSubId;
+ rfcols = lappend_int(rfcols, attnum);
+ }
+ }
+
+ free_object_addresses(context.addrs);
+
+ return rfcols;
+}
+
+/*
* recordDependencyOnExpr - find expression dependencies
*
* This is used to find the dependencies of rules, constraint expressions,
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 1d0f77d..064229f 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -141,9 +141,79 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
}
/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
+ * Decide if the row-filter is valid according to the following rules:
+ *
+ * Rule 1. If the publish operation contains "delete" then only columns that
+ * are allowed by the REPLICA IDENTITY rules are permitted to be used in the
+ * row-filter WHERE clause.
+ *
+ * Rule 2. TODO
*/
+static void
+rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+{
+ Oid relid = RelationGetRelid(rel);
+ char *relname = RelationGetRelationName(rel);
+
+ /*
+ * Rule 1: For "delete", check that filter cols are also valid replica
+ * identity cols.
+ *
+ * TODO - check later for publish "update" case.
+ */
+ if (pub->pubactions.pubdelete)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ List *rfcols;
+ ListCell *lc;
+ Bitmapset *bms_okcols;
+
+ /*
+ * Find what are the cols that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+
+ /*
+ * Find what cols are referenced in the row filter WHERE clause,
+ * and validate that each of those referenced cols is allowed.
+ */
+ rfcols = rowfilter_find_cols(rfnode, relid);
+ foreach(lc, rfcols)
+ {
+ int attnum = lfirst_int(lc);
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, bms_okcols))
+ {
+ const char *colname = get_attname(relid, attnum, false);
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ relname),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+ }
+
+ bms_free(bms_okcols);
+ list_free(rfcols);
+ }
+ }
+}
+
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -242,6 +312,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 2885f35..323f658 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -151,6 +151,8 @@ extern void performDeletion(const ObjectAddress *object,
extern void performMultipleDeletions(const ObjectAddresses *objects,
DropBehavior behavior, int flags);
+extern List *rowfilter_find_cols(Node *expr, Oid relId);
+
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
DependencyType behavior);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index a75ab3c..e7c8c19 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -172,13 +172,15 @@ CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -188,7 +190,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -199,7 +201,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -210,7 +212,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
@@ -238,6 +240,91 @@ DROP TABLE testpub_rf_tbl4;
DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 5b22d51..6701d50 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -105,7 +105,9 @@ CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -132,6 +134,81 @@ DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index 6428f0d..dc9becc 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -223,9 +225,7 @@ $node_publisher->wait_for_catchup($appname);
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -234,7 +234,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
--
1.8.3.1
Attachment: v34-0001-Row-filter-for-logical-replication.patch (application/octet-stream)
From 1d4cf0641cc46d477ca22954e77df52d0325f242 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Mon, 25 Oct 2021 09:15:27 +1100
Subject: [PATCH v34] Row filter for logical replication.
This feature adds a row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (which is limited to the
primary key or REPLICA IDENTITY columns) to evaluate the row filter. INSERT and
UPDATE use the new row version to evaluate the row filter, so any column can be
used. If the row filter evaluates to NULL, it is treated as false. For
simplicity, functions are not allowed; this could be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If the subscriber is a pre-15 version, data
synchronization won't use row filters even if they are defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines whether the partition's row filter (when
the parameter is false, the default) or the root partitioned table's row filter
is used.
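The two evaluation rules above (a NULL filter result counts as false, and a row
is copied during initial synchronization only if it satisfies every
publication's WHERE clause) can be sketched as follows. This is an illustrative
Python model of the described semantics, not the patch's C code; the function
names are invented here.

```python
# Hypothetical model of the row-filter semantics described above.

def passes_filter(result):
    """Map SQL three-valued logic to the publish/skip decision:
    NULL (None) and False both mean "do not publish"."""
    return result is True

def row_is_copied(filter_results):
    """Initial sync: the row must satisfy every publication's filter.
    An empty list models a table published with no WHERE clause."""
    return all(passes_filter(r) for r in filter_results)
```

For instance, a row for which one publication's filter evaluates to TRUE and
another's to NULL would not be copied.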
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 11 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 50 ++++-
src/backend/commands/publicationcmds.c | 104 ++++++----
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 5 +-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 257 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 4 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 5 +-
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 71 +++++++
src/test/regress/sql/publication.sql | 32 +++
src/test/subscription/t/025_row_filter.pl | 300 ++++++++++++++++++++++++++++
25 files changed, 997 insertions(+), 78 deletions(-)
create mode 100644 src/test/subscription/t/025_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index fd6910d..43bc11f 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6233,6 +6233,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index faa114b..4bb4314 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> DROP TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER PUBLICATION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
@@ -92,7 +92,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ff82fbc..8f78fbb 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
- [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ...]
+ [ FOR TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ...]
| FOR ALL TABLES ]
[ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
</synopsis>
@@ -71,6 +71,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -183,6 +187,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or that are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will not
+ be replicated. That is because the old row is used and it contains only the
+ primary key or columns that are part of the <literal>REPLICA IDENTITY</literal>;
+ the remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ and <command>UPDATE</command> operations, any column might be used in the
+ <literal>WHERE</literal> clause, because the new row is used and it contains
+ all columns. A <literal>NULL</literal> value causes the expression to evaluate
+ to false; avoid using columns without not-null constraints in the
+ <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
+ not allow functions or user-defined operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -197,6 +216,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -210,6 +234,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..5a9430e 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -206,7 +206,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether to copy pre-existing data in the publications
that are being subscribed to when the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15, then any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9cd0c82..1d0f77d 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -34,6 +34,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -178,22 +181,28 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -210,10 +219,30 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -227,6 +256,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -243,6 +278,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
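The WHERE clause is parsed with transformWhereClause() and stored, as a serialized node tree, in the new pg_publication_rel.prqual column. A hypothetical session against a server with this patch applied (table and qual are illustrative):

```sql
CREATE TABLE departments (id int PRIMARY KEY, name text);
CREATE PUBLICATION foo FOR TABLE departments WHERE (id > 2000 AND id <= 3000);

-- The stored qual can be deparsed with pg_get_expr(), as the patch
-- itself does in tablesync.c, pg_dump, and psql:
SELECT pg_get_expr(prqual, prrelid)
  FROM pg_catalog.pg_publication_rel;
```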
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 9c7f916..747f388 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -391,38 +391,24 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
List *delrels = NIL;
ListCell *oldlc;
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could instead remove only
+ * (i) tables that are not found in the new table list and (ii) tables
+ * that are being re-added with a different qual expression. But for
+ * (ii), simply updating the existing tuple is not enough, because of
+ * qual expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- PublicationRelInfo *newpubrel;
-
- newpubrel = (PublicationRelInfo *) lfirst(newlc);
- if (RelationGetRelid(newpubrel->relation) == oldrelid)
- {
- found = true;
- break;
- }
- }
- /* Not yet in the list, open it and add to the list */
- if (!found)
- {
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
- }
+ PublicationRelInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -552,9 +538,10 @@ RemovePublicationById(Oid pubid)
}
/*
- * Open relations specified by a PublicationTable list.
- * In the returned list of PublicationRelInfo, tables are locked
- * in ShareUpdateExclusiveLock mode in order to add them to a publication.
+ * Open relations specified by a RangeVar list (PublicationTable or Relation).
+ *
+ * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
+ * add them to a publication.
*/
static List *
OpenTableList(List *tables)
@@ -562,22 +549,46 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelInfo *pub_rel;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- PublicationTable *t = lfirst_node(PublicationTable, lc);
- bool recurse = t->relation->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
- PublicationRelInfo *pub_rel;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a list of PublicationTable
+ * nodes (relation and optional WHERE clause). ALTER PUBLICATION ...
+ * DROP TABLE provides a list of RangeVar nodes. Check which node type
+ * this list element is.
+ */
+ if (IsA(lfirst(lc), PublicationTable))
+ whereclause = true;
+ else
+ whereclause = false;
+
+ if (whereclause)
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+ else
+ {
+ rv = lfirst_node(RangeVar, lc);
+ }
+
+ recurse = rv->inh;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
- rel = table_openrv(t->relation, ShareUpdateExclusiveLock);
+ rel = table_openrv(rv, ShareUpdateExclusiveLock);
myrelid = RelationGetRelid(rel);
/*
@@ -594,7 +605,12 @@ OpenTableList(List *tables)
}
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = myrelid;
pub_rel->relation = rel;
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
@@ -629,7 +645,13 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = childrelid;
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -656,6 +678,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -676,7 +700,7 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(pub_rel->relid, GetUserId()))
aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
RelationGetRelationName(rel));
@@ -705,11 +729,9 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
foreach(lc, rels)
{
PublicationRelInfo *pubrel = (PublicationRelInfo *) lfirst(lc);
- Relation rel = pubrel->relation;
- Oid relid = RelationGetRelid(rel);
prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pubrel->relid),
ObjectIdGetDatum(pubid));
if (!OidIsValid(prid))
{
@@ -719,7 +741,7 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("relation \"%s\" is not part of the publication",
- RelationGetRelationName(rel))));
+ RelationGetRelationName(pubrel->relation))));
}
ObjectAddressSet(obj, PublicationRelRelationId, prid);
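Since OpenTableList() now accepts either PublicationTable or RangeVar nodes, ADD/SET TABLE accept a qual while DROP TABLE takes a bare relation list; a hypothetical example:

```sql
CREATE TABLE t1 (a int);
CREATE PUBLICATION pub1;
ALTER PUBLICATION pub1 ADD TABLE t1 WHERE (a > 10);
-- DROP TABLE takes a relation_expr_list, so no WHERE clause is accepted:
ALTER PUBLICATION pub1 DROP TABLE t1;
```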
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 70e9e54..1f0904d 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4816,6 +4816,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 19eff20..84117bf 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2300,6 +2300,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 08f1bf1..ceeb795 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9637,10 +9637,11 @@ publication_table_list:
{ $$ = lappend($1, $3); }
;
-publication_table: relation_expr
+publication_table: relation_expr OptWhereClause
{
PublicationTable *n = makeNode(PublicationTable);
n->relation = $1;
+ n->whereClause = $2;
$$ = (Node *) n;
}
;
@@ -9681,7 +9682,7 @@ AlterPublicationStmt:
n->tableAction = DEFELEM_SET;
$$ = (Node *)n;
}
- | ALTER PUBLICATION name DROP TABLE publication_table_list
+ | ALTER PUBLICATION name DROP TABLE relation_expr_list
{
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..e946f17 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
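Taken together, these parser changes limit a publication WHERE clause to simple row expressions. Assuming the patch is applied, each of the following would be rejected (table t and its columns are illustrative):

```sql
-- aggregates are not allowed in publication WHERE expressions
CREATE PUBLICATION p1 FOR TABLE t WHERE (sum(a) > 10);
-- functions are not allowed in publication WHERE expressions
CREATE PUBLICATION p2 FOR TABLE t WHERE (length(b) > 3);
-- subqueries are not allowed in publication WHERE expressions
CREATE PUBLICATION p3 FOR TABLE t WHERE (a IN (SELECT x FROM other));
```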
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..9d86a10 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get relation qual. DISTINCT prevents the same expression, published
+ * for this table by multiple publications, from being included multiple
+ * times in the final expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
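For the initial sync of a filtered table, copy_table() builds a COPY (SELECT ...) command with the per-publication quals AND'ed together. Roughly, for a table published with two different filters (names are illustrative):

```sql
COPY (SELECT a, b FROM public.t1
        WHERE (a > 10) AND (a < 100)) TO STDOUT;
```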
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 14d737f..1220203 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +148,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +157,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +639,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache ExprState using CacheMemoryContext. This is the same code as
+ * ExecPrepareExpr(), which is not used here because it requires an
+ * EState. There should probably be a separate executor function for
+ * preparing expressions outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +808,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +832,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +839,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * The schema must be sent before the relation-switching logic below,
+ * because maybe_send_schema() also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +872,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +906,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +975,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1113,9 +1294,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1138,6 +1320,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1149,6 +1333,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *pubids = GetRelationPublications(relid);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1162,6 +1347,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1171,6 +1372,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1230,9 +1434,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1350,6 +1578,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1359,6 +1588,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1376,7 +1607,15 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ {
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
+ }
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
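Two consequences of pgoutput_row_filter() above: a NULL result suppresses the change, and when a subscription receives a table through several publications, every filter must match. Hypothetically:

```sql
CREATE PUBLICATION p_low  FOR TABLE t1 WHERE (a > 10);
CREATE PUBLICATION p_high FOR TABLE t1 WHERE (a < 100);
-- A subscription to both p_low and p_high only receives rows where
-- a > 10 AND a < 100; rows where either filter is NULL are not sent.
```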
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index ed8ed2f..5931067 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4141,6 +4141,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4151,9 +4152,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4162,6 +4170,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4202,6 +4211,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationTable(&(pubrinfo[j].dobj), fout);
@@ -4234,8 +4247,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 29af845..f932a70 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -629,6 +629,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index ea4ca5c..42c43c3 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6328,8 +6328,15 @@ describePublications(const char *pattern)
if (!puballtables)
{
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
@@ -6358,6 +6365,10 @@ describePublications(const char *pattern)
PQgetvalue(tabres, j, 0),
PQgetvalue(tabres, j, 1));
+ if (!PQgetisnull(tabres, j, 2))
+ appendPQExpBuffer(&buf, " WHERE (%s)",
+ PQgetvalue(tabres, j, 2));
+
printTableAddFooter(&cont, buf.data);
}
PQclear(tabres);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 82f2536..e5df91e 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,7 +85,9 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
+ Oid relid;
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -116,7 +118,7 @@ extern List *GetPubPartitionOptionRelations(List *result,
Oid relid);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3138877..1dfdaf3 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3640,6 +3640,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
typedef struct CreatePublicationStmt
@@ -3647,7 +3648,7 @@ typedef struct CreatePublicationStmt
NodeTag type;
char *pubname; /* Name of the publication */
List *options; /* List of DefElem nodes */
- List *tables; /* Optional list of tables to add */
+ List *tables; /* Optional list of PublicationTable to add */
bool for_all_tables; /* Special publication for all tables in db */
} CreatePublicationStmt;
@@ -3660,7 +3661,7 @@ typedef struct AlterPublicationStmt
List *options; /* List of DefElem nodes */
/* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
- List *tables; /* List of tables to add/drop */
+ List *tables; /* List of PublicationTable to add/drop */
bool for_all_tables; /* Special publication for all tables in db */
DefElemAction tableAction; /* What action to perform with the tables */
} AlterPublicationStmt;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 82bce9b..a75ab3c 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -167,6 +167,77 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...R PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e <...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index e5745d5..5b22d51 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -100,6 +100,38 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
new file mode 100644
index 0000000..6428f0d
--- /dev/null
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -0,0 +1,300 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When the same table is published by multiple publications, a row is
+# replicated only if it satisfies every filter expression. In this case, rows
+# are replicated only if c is divisible by both 2 and 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, the row filter references column b, which is not
+# part of the PK or REPLICA IDENTITY, so the old tuple has b = NULL and
+# the row filter evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
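The TAP test above relies on the rule that a table published by several publications is only replicated when the row passes every publication's filter (tab_rowfilter_2 is in tap_pub_1 with `c % 2 = 0` and tap_pub_2 with `c % 3 = 0`). A minimal sketch of that decision, in plain Python rather than the server's C code (the names here are illustrative, not PostgreSQL internals):

```python
# Sketch of the publisher-side decision the test exercises: a change is
# sent only if it satisfies the WHERE clause of every publication that
# includes the table. An empty filter list means "no filter, publish all".

def should_replicate(row, filters):
    # all() over an empty list is True, matching the no-filter case
    return all(f(row) for f in filters)

# tab_rowfilter_2 filters: (c % 2 = 0) from tap_pub_1, (c % 3 = 0) from
# tap_pub_2 -- only multiples of 6 survive, hence the (3, 6, 18) result.
replicated = [c for c in range(1, 21)
              if should_replicate({"c": c},
                                  [lambda r: r["c"] % 2 == 0,
                                   lambda r: r["c"] % 3 == 0])]
```

This mirrors the expected `count(c), min(c), max(c)` = `3|6|18` check in the test.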
Attachment: v34-0006-Support-updates-based-on-old-and-new-tuple-in-ro.patch (application/octet-stream)
From 1eec372b23c8b7680184585b2d28bcdd26c79b2e Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Tue, 26 Oct 2021 14:56:10 +1100
Subject: [PATCH v34] Support updates based on old and new tuple in row
filters.
When applying row filter on updates, check both old_tuple and new_tuple
to decide how an update needs to be transformed.
UPDATE
old-row (match) new-row (no match) -> DELETE
old-row (no match) new row (match) -> INSERT
old-row (match) new row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Also tuples that have been deformed will be cached in slots to avoid
multiple deforming of tuples.
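The transformation table in this commit message can be sketched as a small decision function; this is an illustration of the rule, not the patch's actual C implementation (which works on ReorderBufferChangeType):

```python
# Decide how a filtered UPDATE is published, based on whether the old and
# new tuples match the publication's row filter.

def transform_update(old_matches, new_matches):
    if old_matches and new_matches:
        return "UPDATE"            # row stays inside the filtered set
    if old_matches and not new_matches:
        return "DELETE"            # row leaves the filtered set
    if not old_matches and new_matches:
        return "INSERT"            # row enters the filtered set
    return None                    # never visible: drop the change
```

The INSERT/DELETE conversions keep the subscriber consistent: without them, a row that moves across the filter boundary would be silently missing or stale downstream.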
---
src/backend/replication/logical/proto.c | 122 +++++++++++++
src/backend/replication/pgoutput/pgoutput.c | 263 ++++++++++++++++++++++++++--
src/include/replication/logicalproto.h | 4 +
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/025_row_filter.pl | 4 +-
5 files changed, 377 insertions(+), 22 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..6b14340 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -32,6 +33,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
HeapTuple tuple, bool binary);
+static void logicalrep_write_tuple_cached(StringInfo out, Relation rel,
+ TupleTableSlot *slot, bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -438,6 +441,38 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
}
/*
+ * Write UPDATE to the output stream using cached virtual slots.
+ * Cached updates will have both old tuple and new tuple.
+ */
+void
+logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple, bool binary)
+{
+ pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
+
+ Assert(rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT ||
+ rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
+ rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX);
+
+ /* transaction ID (if not valid, we're not streaming) */
+ if (TransactionIdIsValid(xid))
+ pq_sendint32(out, xid);
+
+ /* use Oid as relation identifier */
+ pq_sendint32(out, RelationGetRelid(rel));
+
+ if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ pq_sendbyte(out, 'O'); /* old tuple follows */
+ else
+ pq_sendbyte(out, 'K'); /* old key follows */
+ logicalrep_write_tuple_cached(out, rel, oldtuple, binary);
+
+ pq_sendbyte(out, 'N'); /* new tuple follows */
+ logicalrep_write_tuple_cached(out, rel, newtuple, binary);
+}
+
+
+/*
* Write UPDATE to the output stream.
*/
void
@@ -746,6 +781,93 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
}
/*
+ * Write a tuple to the outputstream using a cached slot, in the most efficient format possible.
+ */
+static void
+logicalrep_write_tuple_cached(StringInfo out, Relation rel, TupleTableSlot *slot, bool binary)
+{
+ TupleDesc desc;
+ int i;
+ uint16 nliveatts = 0;
+ HeapTuple tuple = ExecFetchSlotHeapTuple(slot, false, NULL);
+
+ desc = RelationGetDescr(rel);
+
+ for (i = 0; i < desc->natts; i++)
+ {
+ if (TupleDescAttr(desc, i)->attisdropped || TupleDescAttr(desc, i)->attgenerated)
+ continue;
+ nliveatts++;
+ }
+ pq_sendint16(out, nliveatts);
+
+ /* try to allocate enough memory from the get-go */
+ enlargeStringInfo(out, tuple->t_len +
+ nliveatts * (1 + 4));
+
+ /* Write the values */
+ for (i = 0; i < desc->natts; i++)
+ {
+ HeapTuple typtup;
+ Form_pg_type typclass;
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ if (att->attisdropped || att->attgenerated)
+ continue;
+
+ if (slot->tts_isnull[i])
+ {
+ pq_sendbyte(out, LOGICALREP_COLUMN_NULL);
+ continue;
+ }
+
+ if (att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(slot->tts_values[i]))
+ {
+ /*
+ * Unchanged toasted datum. (Note that we don't promise to detect
+ * unchanged data in general; this is just a cheap check to avoid
+ * sending large values unnecessarily.)
+ */
+ pq_sendbyte(out, LOGICALREP_COLUMN_UNCHANGED);
+ continue;
+ }
+
+ typtup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(att->atttypid));
+ if (!HeapTupleIsValid(typtup))
+ elog(ERROR, "cache lookup failed for type %u", att->atttypid);
+ typclass = (Form_pg_type) GETSTRUCT(typtup);
+
+ /*
+ * Send in binary if requested and type has suitable send function.
+ */
+ if (binary && OidIsValid(typclass->typsend))
+ {
+ bytea *outputbytes;
+ int len;
+
+ pq_sendbyte(out, LOGICALREP_COLUMN_BINARY);
+ outputbytes = OidSendFunctionCall(typclass->typsend, slot->tts_values[i]);
+ len = VARSIZE(outputbytes) - VARHDRSZ;
+ pq_sendint(out, len, 4); /* length */
+ pq_sendbytes(out, VARDATA(outputbytes), len); /* data */
+ pfree(outputbytes);
+ }
+ else
+ {
+ char *outputstr;
+
+ pq_sendbyte(out, LOGICALREP_COLUMN_TEXT);
+ outputstr = OidOutputFunctionCall(typclass->typoutput, slot->tts_values[i]);
+ pq_sendcountedtext(out, outputstr, strlen(outputstr), false);
+ pfree(outputstr);
+ }
+
+ ReleaseSysCache(typtup);
+ }
+}
+
+
+/*
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 185e8d0..17b4093 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -132,7 +132,10 @@ typedef struct RelationSyncEntry
*/
bool rowfilter_valid;
List *exprstates; /* ExprState for row filter(s) */
- TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
+ TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
+ TupleTableSlot *tmp_new_tuple; /* slot for temporary new tuple used for expression evaluation */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -167,10 +170,16 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot,
+ RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry,
+ ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -734,18 +743,103 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * Update is checked against the row filter, if any.
+ * Updates are transformed to inserts and deletes based on the
+ * old_tuple and new_tuple. The new action is updated in the
+ * action parameter. If not updated, action remains as update.
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (no match) new row (match) -> INSERT
+ * old-row (match) new row (match) -> UPDATE
+ * old-row (no match) new-row (no match) -> (drop change)
+ * If it returns true, the change is to be replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter_update(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
{
- EState *estate;
- ExprContext *ecxt;
+ TupleDesc desc = entry->scantuple->tts_tupleDescriptor;
+ int i;
+ bool old_matched, new_matched;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstates == NIL)
+ return true;
+
+ /* update requires a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /*
+ * If no old_tuple, then none of the replica identity columns changed
+ * and this would reduce to a simple update.
+ */
+ if (!oldtuple)
+ return pgoutput_row_filter(relation, NULL, newtuple, entry);
+
+
+ old_slot = entry->old_tuple;
+ new_slot = entry->new_tuple;
+ tmp_new_slot = entry->tmp_new_tuple;
+ ExecClearTuple(old_slot);
+ ExecClearTuple(new_slot);
+ ExecClearTuple(tmp_new_slot);
+
+ /*
+ * Unchanged toasted replica identity columns are only detoasted in the
+ * old tuple; copy them over to the new tuple.
+ */
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, new_slot);
+
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ }
+
+ }
+
+ old_matched = pgoutput_row_filter_virtual(relation, old_slot, entry);
+ new_matched = pgoutput_row_filter_virtual(relation, tmp_new_slot, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && !old_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter the first time through.
+ */
+
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
+{
+ Oid relid = RelationGetRelid(relation);
ListCell *lc;
- bool result = true;
- Oid relid = RelationGetRelid(relation);
/*
* If the row filter caching is currently flagged "invalid" then it means we
@@ -760,7 +854,7 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
TupleDesc tupdesc = RelationGetDescr(relation);
/*
- * Create a tuple table slot for row filter. TupleDesc must live as
+ * Create tuple table slots for row filter. TupleDesc must live as
* long as the cache remains. Release the tuple table slot if it
* already exists.
*/
@@ -769,9 +863,31 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
+ if (entry->old_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->old_tuple);
+ entry->old_tuple = NULL;
+ }
+
+ if (entry->new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->new_tuple);
+ entry->new_tuple = NULL;
+ }
+
+ if (entry->tmp_new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->tmp_new_tuple);
+ entry->tmp_new_tuple = NULL;
+ }
+
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+
MemoryContextSwitchTo(oldctx);
/*
@@ -846,6 +962,76 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
entry->rowfilter_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ ListCell *lc;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstates == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = slot;
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, entry->exprstates)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ ListCell *lc;
/* Bail out if there is no row filter */
if (entry->exprstates == NIL)
@@ -941,6 +1127,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -949,7 +1138,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
break;
/*
@@ -980,9 +1169,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
- /* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter_update(relation, oldtuple, newtuple, relentry,
+ &modified_action))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1005,8 +1195,29 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ if (relentry->new_tuple != NULL && !TTS_EMPTY(relentry->new_tuple))
+ logicalrep_write_update_cached(ctx->out, xid, relation,
+ relentry->old_tuple, relentry->new_tuple, data->binary);
+ else
+ logicalrep_write_update(ctx->out, xid, relation, oldtuple,
+ newtuple, data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1016,7 +1227,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1431,6 +1642,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
+ entry->new_tuple = NULL;
+ entry->old_tuple = NULL;
+ entry->tmp_new_tuple = NULL;
entry->exprstates = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
@@ -1640,6 +1854,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
+ if (entry->new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->new_tuple);
+ entry->new_tuple = NULL;
+ }
+ if (entry->old_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->old_tuple);
+ entry->old_tuple = NULL;
+ }
+ if (entry->tmp_new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->tmp_new_tuple);
+ entry->tmp_new_tuple = NULL;
+ }
if (entry->exprstates != NIL)
{
list_free_deep(entry->exprstates);
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..ba71f3f 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -212,6 +213,9 @@ extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
HeapTuple newtuple, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index dc9becc..742bbbe 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -220,7 +220,8 @@ $node_publisher->wait_for_catchup($appname);
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+# but row deleted after the update below.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -232,7 +233,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
--
1.8.3.1
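The UPDATE transformation table in the patch comment above (old-row match / new-row match deciding DELETE, INSERT, UPDATE, or drop) can be modeled with a short sketch. This is a hypothetical Python illustration of the decision logic in pgoutput_row_filter_update, not the patch's C code; the function name and boolean inputs are stand-ins for the actual slot-based evaluation.

```python
def transform_update(old_matches: bool, new_matches: bool):
    """Model of the row-filter UPDATE transformation:
    choose the action to publish based on whether the old
    and new row versions satisfy the filter expression."""
    if old_matches and not new_matches:
        return "DELETE"   # row left the filtered set
    if not old_matches and new_matches:
        return "INSERT"   # row entered the filtered set
    if old_matches and new_matches:
        return "UPDATE"   # row stays inside the filtered set
    return None           # neither version matches: drop the change

# With a filter such as (id > 1000), updating a row from id 800 to
# id 1600 moves it into the filtered set, so it is sent as an INSERT.
print(transform_update(False, True))
```

This is why the 025_row_filter.pl expectations change in the hunk above: a row can be inserted on the publisher and later disappear on the subscriber when an UPDATE moves it out of the filtered set.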
On Thu, Sep 23, 2021 at 10:33 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
7) exprstate_list
I'd just call the field / variable "exprstates", without indicating the
data type. I don't think we do that anywhere.
Fixed in v34. [1]
8) RfCol
Do we actually need this struct? Why not to track just name or attnum,
and lookup the other value in syscache when needed?
Fixed in v34. [1]
9) rowfilter_expr_checker
* Walk the parse-tree to decide if the row-filter is valid or not.
I don't see any clear explanation what does "valid" mean.
Updated comment in v34. [1]
------
[1]: /messages/by-id/CAHut+PvWk4w+NEAqB32YkQa75tSkXi50cq6suV9f3fASn5C9NA@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia.
On Mon, Sep 27, 2021 at 2:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sat, Sep 25, 2021 at 3:36 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 9/25/21 6:23 AM, Amit Kapila wrote:
On Sat, Sep 25, 2021 at 3:30 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 9/24/21 7:20 AM, Amit Kapila wrote:
I think the right way to support functions is by the explicit marking
of functions and in one of the emails above Jeff Davis also agreed
with the same. I think we should probably introduce a new marking for
this. I feel this is important because without this it won't be safe
to access even some of the built-in functions that can access/update
database (non-immutable functions) due to logical decoding environment
restrictions.
I agree that seems reasonable. Is there any reason why not to just use
IMMUTABLE for this purpose? Seems like a good match to me.
It will just solve one part of the puzzle (related to database access)
but it is better to avoid the risk of broken replication by explicit
marking especially for UDFs or other user-defined objects. You seem to
be okay documenting such risk but I am not sure we have an agreement
on that especially because that was one of the key points of
discussions in this thread and various people told that we need to do
something about it. I personally feel we should do something if we
want to allow user-defined functions or operators because as reported
in the thread this problem has been reported multiple times. I think
we can go ahead with IMMUTABLE built-ins for the first version and
then allow UDFs later or let's try to find a way for explicit marking.
Well, I know multiple people mentioned that issue. And I certainly agree
just documenting the risk would not be an ideal solution. Requiring the
functions to be labeled helps, but we've seen people marking volatile
functions as immutable in order to allow indexing, so we'll have to
document the risks anyway.
All I'm saying is that allowing built-in functions/operators but not
user-defined variants seems like an annoying break of extensibility.
People are used that user-defined stuff can be used just like built-in
functions and operators.
I agree with you that allowing UDFs in some way would be good for this
feature. I think once we get the base feature committed then we can
discuss whether and how to allow UDFs. Do we want to have an
additional label for it or can we come up with something which allows
the user to continue replication even if she has dropped the object
used in the function? It seems like we can limit the scope of base
patch functionality to allow the use of immutable built-in functions
in row filter expressions.
OK, immutable system functions are now allowed in v34 [1].
------
[1]: /messages/by-id/CAHut+PvWk4w+NEAqB32YkQa75tSkXi50cq6suV9f3fASn5C9NA@mail.gmail.com
Kind Regards,
Peter Smith
Fujitsu Australia
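The reason the discussion above converges on IMMUTABLE built-ins can be illustrated with a small sketch. This is a hypothetical Python model, not PostgreSQL code: the two functions stand in for an immutable and a volatile row-filter expression, and the point is that decoding the same change twice must yield the same replicate/skip decision.

```python
def immutable_filter(row):
    # Depends only on the row's column values, like (id > 1000):
    # re-decoding the same change always gives the same answer.
    return row["id"] > 1000

_state = {"n": 0}

def volatile_filter(row):
    # Stand-in for a volatile function: the result depends on hidden
    # state, so re-decoding the same change (e.g. after a walsender
    # restart) can flip the decision and break replication consistency.
    _state["n"] += 1
    return _state["n"] % 2 == 0

row = {"id": 1600}
assert immutable_filter(row) == immutable_filter(row)   # stable
assert volatile_filter(row) != volatile_filter(row)     # unstable
```

The same hazard applies to a UDF marked IMMUTABLE that is later dropped or redefined, which is why explicit marking for row-filter safety is being considered rather than relying on the IMMUTABLE label alone.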
On Tue, Oct 26, 2021 at 3:24 PM Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of patches:
v34-0001 = the "main" patch from Euler. No change
v34-0002 = tab auto-complete. No change
v34-0003 = cache updates. Addresses Tomas review comment #3 [1].
v34-0004 = filter validation replica identity. Addresses Tomas review
comment #8 and #9 [1].
v34-0005 = filter validation walker. Addresses Tomas review comment #6 [1]
v34-0006 = support old/new tuple logic for row-filters. Modified, but
no functional change.
------
[1] /messages/by-id/574b4e78-2f35-acf3-4bdc-4b872582e739@enterprisedb.com
A few comments for some things I have noticed so far:
1) scantuple cleanup seems to be missing since the v33-0001 patch.
2) I don't think that the ResetExprContext() calls (before
FreeExecutorState()) are needed in the pgoutput_row_filter() and
pgoutput_row_filter_virtual() functions.
3) make check-world fails, due to recent changes to PostgresNode.pm.
I found that the following updates are needed:
diff --git a/src/test/subscription/t/025_row_filter.pl
b/src/test/subscription/t/025_row_filter.pl
index 742bbbe8a8..3fc503f2e4 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -1,17 +1,17 @@
# Test logical replication behavior with row filtering
use strict;
use warnings;
-use PostgresNode;
-use TestLib;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
use Test::More tests => 7;
# create publisher node
-my $node_publisher = PostgresNode->new('publisher');
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
$node_publisher->init(allows_streaming => 'logical');
$node_publisher->start;
# create subscriber node
-my $node_subscriber = PostgresNode->new('subscriber');
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
$node_subscriber->init(allows_streaming => 'logical');
$node_subscriber->start;
Regards,
Greg Nancarrow
Fujitsu Australia
On Tue, Oct 26, 2021 at 3:24 PM Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of patches:
v34-0001 = the "main" patch from Euler. No change
v34-0002 = tab auto-complete. No change
v34-0003 = cache updates. Addresses Tomas review comment #3 [1].
v34-0004 = filter validation replica identity. Addresses Tomas review
comment #8 and #9 [1].
v34-0005 = filter validation walker. Addresses Tomas review comment #6 [1]
v34-0006 = support old/new tuple logic for row-filters. Modified, but
no functional change.
Regarding the v34-0006 patch, shouldn't it also include an update to
the rowfilter_expr_checker() function added by the v34-0002 patch, for
validating the referenced row-filter columns in the case of UPDATE?
I was thinking something like the following (or is it more complex than this?):
diff --git a/src/backend/catalog/pg_publication.c
b/src/backend/catalog/pg_publication.c
index dc2f4597e6..579e727b10 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -162,12 +162,10 @@ rowfilter_expr_checker(Publication *pub,
ParseState *pstate, Node *rfnode, Relat
rowfilter_validator(relname, rfnode);
/*
- * Rule 2: For "delete", check that filter cols are also valid replica
- * identity cols.
- *
- * TODO - check later for publish "update" case.
+ * Rule 2: For "delete" and "update", check that filter cols are also
+ * valid replica identity cols.
*/
- if (pub->pubactions.pubdelete)
+ if (pub->pubactions.pubdelete || pub->pubactions.pubupdate)
{
char replica_identity = rel->rd_rel->relreplident;
Regards,
Greg Nancarrow
Fujitsu Australia
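The validation rule Greg proposes extending to UPDATE can be sketched abstractly. This is a hypothetical Python model of the check, not the rowfilter_expr_checker C code: if the publication publishes DELETE (or, per the suggestion, UPDATE), every column referenced by the row filter must be part of the replica identity, because the old tuple carries only replica-identity columns.

```python
def rowfilter_cols_valid(filter_cols, replica_identity_cols,
                         publishes_update_or_delete):
    """Return True when the row filter's referenced columns are
    usable: for UPDATE/DELETE publications they must be a subset
    of the replica identity columns, since filter evaluation on
    the old tuple has no other column values available."""
    if not publishes_update_or_delete:
        return True   # INSERT-only publications can reference any column
    return set(filter_cols) <= set(replica_identity_cols)

# Filter (id > 1000) with REPLICA IDENTITY over (id): valid.
print(rowfilter_cols_valid({"id"}, {"id"}, True))
# Filter also referencing column b, which is not in the identity: invalid.
print(rowfilter_cols_valid({"id", "b"}, {"id"}, True))
```

This matches the comment in the commit message of v35-0001 that a DELETE cannot be filtered correctly unless the filter columns are covered by the primary key or REPLICA IDENTITY.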
The v34* patch set is temporarily broken.
It was impacted quite a lot by the recently committed "schema
publication" patch [1].
We are actively fixing the full v34* patch set and will re-post it
here as soon as the re-base hurdles can be overcome.
Meanwhile, the small tab-complete patch (which is independent of the
others) is the only patch currently working, so I am attaching it so
at least the cfbot can have something to run.
------
[1]: https://github.com/postgres/postgres/commit/5a2832465fd8984d089e8c44c094e6900d987fcd
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v35-0001-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patchapplication/octet-stream; name=v35-0001-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patchDownload
From 9c41c708cbf1f7ae6c89a128e472494a54b3b4ae Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 28 Oct 2021 11:46:42 +1100
Subject: [PATCH v35] PS - Add tab auto-complete support for the Row Filter
WHERE.
e.g.
"CREATE PUBLICATION <name> FOR TABLE <name>" - complete with "WHERE (".
"ALTER PUBLICATION <name> ADD|SET TABLE <name>" - complete with "WHERE (".
---
src/bin/psql/tab-complete.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 8e01f54..c7c765b 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,14 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (");
+ /* "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with table attributes */
+ /* "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2708,10 +2716,13 @@ psql_completion(const char *text, int start, int end)
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
+ /* "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
/*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
--
1.8.3.1
Here's a rebase of the first 4 patches of the row-filter patch. Some
issues still remain:
1. the following changes for adding OptWhereClause to the
PublicationObjSpec has not been added
as the test cases for this has not been yet rebased:
PublicationObjSpec:
...
+ TABLE relation_expr OptWhereClause
...
+ | ColId OptWhereClause
...
+ | ColId indirection OptWhereClause
...
+ | extended_relation_expr OptWhereClause
2. Changes made to AlterPublicationTables() undid changes that were as
part of the schema publication patch. This needs to be resolved
with the correct approach.
The patch 0005 and 0006 has not yet been rebased but will be updated
in a few days.
regards,
Ajin Cherian
Attachments:
v35-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patchapplication/octet-stream; name=v35-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patchDownload
From 18e24749610beccd4fda12600ea5c7c51ef392b7 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Tue, 2 Nov 2021 06:49:32 -0400
Subject: [PATCH v35 2/4] PS - Add tab auto-complete support for the Row Filter
WHERE.
e.g.
"CREATE PUBLICATION <name> FOR TABLE <name>" - complete with "WHERE (".
"ALTER PUBLICATION <name> ADD|SET TABLE <name>" - complete with "WHERE (".
---
src/bin/psql/tab-complete.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 8e01f54..c7c765b 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,14 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (");
+ /* "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with table attributes */
+ /* "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2708,10 +2716,13 @@ psql_completion(const char *text, int start, int end)
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
+ /* "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
/*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
--
1.8.3.1
v35-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v35-0001-Row-filter-for-logical-replication.patchDownload
From 4c63c7e8eedb3a885649d9eb223b37a5f222fd34 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Tue, 2 Nov 2021 06:36:25 -0400
Subject: [PATCH v35 1/4] Row filter for logical replication.
This feature adds a row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If the subscriber is a pre-15 version, data
synchronization won't use row filters if they are defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 13 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 50 ++++-
src/backend/commands/publicationcmds.c | 70 ++++++-
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 38 +++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 257 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 4 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 71 +++++++
src/test/regress/sql/publication.sql | 32 +++
src/test/subscription/t/025_row_filter.pl | 300 ++++++++++++++++++++++++++++
25 files changed, 1024 insertions(+), 48 deletions(-)
mode change 100644 => 100755 src/backend/parser/gram.y
create mode 100644 src/test/subscription/t/025_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1d11be..af6b1f6 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6299,6 +6299,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index bb4ef5e..01247d7 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -30,7 +30,7 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -52,7 +52,9 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
remove one or more tables/schemas from the publication. Note that adding
tables/schemas to a publication that is already subscribed to will require a
<literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on the
- subscribing side in order to become effective.
+ subscribing side in order to become effective. Note also that a
+ row-filter <literal>WHERE</literal> clause cannot be combined with the
+ <literal>DROP</literal> clause.
</para>
<para>
@@ -109,7 +111,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
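To illustrate the amended <command>ALTER PUBLICATION</command> synopsis, a hypothetical session (publication, table, and filter values are made up):

```sql
-- Replace the publication's table list, attaching a row filter.
ALTER PUBLICATION mypub SET TABLE departments WHERE (id > 2000 AND id <= 3000);

-- A WHERE clause is rejected when dropping a table; per the patch this fails:
-- ALTER PUBLICATION mypub DROP TABLE departments WHERE (id > 2000);
```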
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ca01d8c..52f6a1c 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -28,7 +28,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -76,6 +76,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -226,6 +230,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will
+ not be replicated. That is because the old row is used, and it contains
+ only the primary key or the <literal>REPLICA IDENTITY</literal> columns;
+ the remaining columns are <literal>NULL</literal>. For
+ <command>INSERT</command> and <command>UPDATE</command> operations, any
+ column may be used in the <literal>WHERE</literal> clause, because the
+ new row is used and it contains all columns. A <literal>NULL</literal>
+ value causes the expression to evaluate to false; avoid using columns
+ without not-null constraints in the <literal>WHERE</literal> clause. The
+ <literal>WHERE</literal> clause does not allow functions or user-defined operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -240,6 +259,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -253,6 +277,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
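The <command>DELETE</command> restriction described above can be sketched as follows (hypothetical table; only the primary key survives in the old row):

```sql
CREATE TABLE departments (id int PRIMARY KEY, active boolean);

-- OK: the filter references only the primary key, so DELETEs replicate.
CREATE PUBLICATION pub_ok FOR TABLE departments WHERE (id > 100);

-- Risky: "active" is not part of the replica identity; in the old row it
-- is NULL, the filter evaluates to false, and DELETEs are not replicated.
CREATE PUBLICATION pub_risky FOR TABLE departments WHERE (active IS TRUE);
```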
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..5a9430e 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -206,7 +206,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether to copy pre-existing data in the publications
that are being subscribed to when the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the publisher is running a
+ <productname>PostgreSQL</productname> version before 15, any row
+ filtering is ignored during the initial data copy.
</para>
</listitem>
</varlistentry>
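The "rows must satisfy all expressions" behaviour for the initial copy can be illustrated with a hypothetical pair of publications over the same table:

```sql
CREATE PUBLICATION pub_a FOR TABLE departments WHERE (id > 2000);
CREATE PUBLICATION pub_b FOR TABLE departments WHERE (id <= 3000);

-- With the default copy_data = true, only rows satisfying
-- id > 2000 AND id <= 3000 are copied to the subscriber.
CREATE SUBSCRIPTION mysub
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION pub_a, pub_b;
```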
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index fed83b8..57d08e7 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -251,22 +254,28 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -283,10 +292,30 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -300,6 +329,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -316,6 +351,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index d1fff13..413f08d 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -529,6 +529,28 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *delrels = NIL;
ListCell *oldlc;
+#if 1
+ /* FIXME: can we do a better job of integrating this with the schema changes? */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
+ foreach(oldlc, oldrelids)
+ {
+ Oid oldrelid = lfirst_oid(oldlc);
+ PublicationRelInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
+ }
+#else
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
@@ -565,6 +587,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
delrels = lappend(delrels, pubrel);
}
}
+#endif
/* And drop them. */
PublicationDropTables(pubid, delrels, true);
@@ -899,22 +922,46 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelInfo *pub_rel;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- PublicationTable *t = lfirst_node(PublicationTable, lc);
- bool recurse = t->relation->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
- PublicationRelInfo *pub_rel;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a list of PublicationTable
+ * nodes (relation and WHERE clause), while DROP TABLE provides a list
+ * of RangeVar nodes. Check which element type is in use.
+ */
+ if (IsA(lfirst(lc), PublicationTable))
+ whereclause = true;
+ else
+ whereclause = false;
+
+ if (whereclause)
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+ else
+ {
+ rv = lfirst_node(RangeVar, lc);
+ }
+
+ recurse = rv->inh;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
- rel = table_openrv(t->relation, ShareUpdateExclusiveLock);
+ rel = table_openrv(rv, ShareUpdateExclusiveLock);
myrelid = RelationGetRelid(rel);
/*
@@ -931,7 +978,12 @@ OpenTableList(List *tables)
}
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = myrelid;
pub_rel->relation = rel;
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
@@ -966,7 +1018,13 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = childrelid;
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -993,6 +1051,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -1042,7 +1102,7 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(pub_rel->relid, GetUserId()))
aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
RelationGetRelationName(rel));
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 82464c9..0f4c64d 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4829,6 +4829,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index f537d3e..5776447 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2312,6 +2312,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
old mode 100644
new mode 100755
index d0eb80e..432cf79
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -205,7 +205,7 @@ static void processCASbits(int cas_bits, int location, const char *constrType,
bool *deferrable, bool *initdeferred, bool *not_valid,
bool *no_inherit, core_yyscan_t yyscanner);
static void preprocess_pubobj_list(List *pubobjspec_list,
- core_yyscan_t yyscanner);
+ core_yyscan_t yyscanner, bool alter_drop);
static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%}
@@ -9633,7 +9633,7 @@ CreatePublicationStmt:
n->pubname = $3;
n->options = $6;
n->pubobjects = (List *)$5;
- preprocess_pubobj_list(n->pubobjects, yyscanner);
+ preprocess_pubobj_list(n->pubobjects, yyscanner, false);
$$ = (Node *)n;
}
;
@@ -9652,12 +9652,13 @@ CreatePublicationStmt:
* relation_expr here.
*/
PublicationObjSpec:
- TABLE relation_expr
+ TABLE relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_TABLE;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $2;
+ $$->pubtable->whereClause = $3;
}
| ALL TABLES IN_P SCHEMA ColId
{
@@ -9672,11 +9673,20 @@ PublicationObjSpec:
$$->pubobjtype = PUBLICATIONOBJ_CURRSCHEMA;
$$->location = @5;
}
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
$$->location = @1;
}
| ColId indirection
@@ -9739,7 +9749,7 @@ AlterPublicationStmt:
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
n->pubobjects = $5;
- preprocess_pubobj_list(n->pubobjects, yyscanner);
+ preprocess_pubobj_list(n->pubobjects, yyscanner, false);
n->action = DEFELEM_ADD;
$$ = (Node *)n;
}
@@ -9748,7 +9758,7 @@ AlterPublicationStmt:
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
n->pubobjects = $5;
- preprocess_pubobj_list(n->pubobjects, yyscanner);
+ preprocess_pubobj_list(n->pubobjects, yyscanner, false);
n->action = DEFELEM_SET;
$$ = (Node *)n;
}
@@ -9757,7 +9767,7 @@ AlterPublicationStmt:
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
n->pubobjects = $5;
- preprocess_pubobj_list(n->pubobjects, yyscanner);
+ preprocess_pubobj_list(n->pubobjects, yyscanner, true);
n->action = DEFELEM_DROP;
$$ = (Node *)n;
}
@@ -17310,7 +17320,7 @@ processCASbits(int cas_bits, int location, const char *constrType,
* convert PUBLICATIONOBJ_CONTINUATION into appropriate PublicationObjSpecType.
*/
static void
-preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
+preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner, bool alter_drop)
{
ListCell *cell;
PublicationObjSpec *pubobj;
@@ -17341,7 +17351,15 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ /* a row filter cannot be used when dropping a table from a publication */
+ if (pubobj->pubtable && pubobj->pubtable->whereClause && alter_drop)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid use of row-filter for DROP TABLE"),
+ parser_errposition(pubobj->location));
+
+ if (pubobj->name)
{
/* convert it to PublicationTable */
PublicationTable *pubtable = makeNode(PublicationTable);
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..e946f17 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..9d86a10 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get the relation quals. DISTINCT avoids including the same expression
+ * more than once in the final expression when a table is published with
+ * the same row filter in multiple publications.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
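For a filtered table, the tablesync code above issues a COPY (SELECT ...) on the publisher with the deparsed quals AND'ed together. Assuming a table departments published with filters id &gt; 2000 and id &lt;= 3000, the generated command would look roughly like:

```sql
-- Illustrative only; the actual text comes from pg_get_expr() output.
COPY (SELECT id, name FROM public.departments
      WHERE (id > 2000) AND (id <= 3000)) TO STDOUT;
```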
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6f6a203..077ae18 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +148,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +157,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +639,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache the ExprState using CacheMemoryContext. This mirrors
+ * ExecPrepareExpr(), which is not used here because it requires an
+ * EState. The executor should probably provide another function for
+ * expression execution outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +808,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +832,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +839,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * The schema must be sent before the relation-switching logic
+ * below, because maybe_send_schema() also sends the ancestor's
+ * schema when publishing via the root.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +872,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +906,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +975,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1116,9 +1297,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1141,6 +1323,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1160,6 +1344,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *schemaPubids = GetSchemaPublications(schemaId);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1173,6 +1358,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for the row filter. The TupleDesc must
+ * live as long as the cache entry remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1182,6 +1383,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1245,9 +1449,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1365,6 +1593,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1603,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1391,7 +1622,15 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ {
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
+ }
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index b9635a9..661fdf6 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4229,6 +4229,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4239,9 +4240,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4250,6 +4258,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4290,6 +4299,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationObject(&(pubrinfo[j].dobj), fout);
@@ -4360,8 +4373,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index f9af14b..00636ca 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -630,6 +630,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 0066614..0a7f278 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6320,8 +6320,12 @@ addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
for (i = 0; i < count; i++)
{
if (!singlecol)
+ {
printfPQExpBuffer(buf, " \"%s.%s\"", PQgetvalue(res, i, 0),
PQgetvalue(res, i, 1));
+ if (!PQgetisnull(res, i, 2))
+ appendPQExpBuffer(buf, " WHERE (%s)", PQgetvalue(res, i, 2));
+ }
else
printfPQExpBuffer(buf, " \"%s\"", PQgetvalue(res, i, 0));
@@ -6450,8 +6454,15 @@ describePublications(const char *pattern)
{
/* Get the tables for the specified publication */
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 1ae439e..964c204 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,7 +85,9 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
+ Oid relid;
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -122,7 +124,7 @@ extern List *GetPubPartitionOptionRelations(List *result,
Oid relid);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 49123e2..bb24aec 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3641,6 +3641,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 0f4fe4d..cd01d4d 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -239,6 +239,77 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: invalid use of row-filter for DROP TABLE
+LINE 1: ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE ...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 85a5302..2be5f74 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -134,6 +134,38 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
new file mode 100644
index 0000000..e806b5d
--- /dev/null
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -0,0 +1,300 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions must succeed. In this case, rows are replicated only if c is
+# divisible by both 2 AND 3 (i.e. 6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# The publication option publish_via_partition_root is false, so the row
+# filter of the leaf partition is used.
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, because the row filter references column b, which
+#   is not part of the PK or REPLICA IDENTITY; the old tuple has b = NULL,
+#   so the row filter evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# The publication option publish_via_partition_root is true, so the row
+# filter of the root partitioned table is used.
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After the TRUNCATE and REFRESH PUBLICATION, the initial data copy applies
+# the row filter of the root partitioned table.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
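The TAP test above also exercises the NULL behavior of row filters: the UPDATE setting b = NULL and the DELETE whose old tuple has b = NULL are both filtered out. This follows SQL three-valued logic, where a WHERE clause evaluating to NULL is treated as not matching. A minimal Python sketch (the evaluator names are illustrative, not the patch's API):

```python
def tv_and(x, y):
    """SQL three-valued AND: False dominates, then NULL (None), then True."""
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True

def ne(x, y):
    """SQL <>: yields NULL if either operand is NULL."""
    return None if x is None or y is None else x != y

def replicate(a, b):
    """Row filter (a > 1000 AND b <> 'filtered'); only a strict True replicates."""
    return tv_and(a > 1000, ne(b, 'filtered')) is True
```

With b = NULL the filter evaluates to NULL, not True, so the change is not sent even though a > 1000 holds.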
v35-0003-PS-ExprState-cache-modifications.patch (application/octet-stream) Download
From fab0fa44cc08a81f5cda3c1dad07acf60828c690 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Tue, 2 Nov 2021 07:11:08 -0400
Subject: [PATCH v35 3/4] PS - ExprState cache modifications.
The cached row-filter state (e.g. the ExprState list) is now invalidated only
in the rel_sync_cache_relation_cb function, so ALTER PUBLICATION for one table
no longer invalidates the row filters of other tables.
Also, all code related to caching row filters has been removed from the
get_rel_sync_entry function; the caching is now done just before the filters
are needed, in the pgoutput_row_filter function.
Changes are based on suggestions from Amit [1] [2].
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
---
src/backend/replication/pgoutput/pgoutput.c | 200 +++++++++++++++++++---------
1 file changed, 136 insertions(+), 64 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 077ae18..3dfac7d 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -123,7 +123,15 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
- List *exprstate; /* ExprState for row filter */
+
+ /*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' only means the exprstates list is correct -
+ * it doesn't mean that there actually is any row filter present for the
+ * current relid.
+ */
+ bool rowfilter_valid;
+ List *exprstates; /* ExprState for row filter(s) */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
@@ -161,7 +169,7 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static EState *create_estate_for_relation(Relation rel);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
/*
@@ -731,20 +739,121 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
* If it returns true, the change is replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
{
EState *estate;
ExprContext *ecxt;
ListCell *lc;
bool result = true;
+ Oid relid = RelationGetRelid(relation);
+
+ /*
+ * If the row filter caching is currently flagged "invalid" then it means we
+ * don't know yet whether there are any row filters for this relation.
+ *
+ * This code is usually one-time execution.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ bool am_partition = get_rel_relispartition(relid);
+ MemoryContext oldctx;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains. Release the tuple table slot if it
+ * already exists.
+ */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState(s) and cache them in the
+ * entry->exprstates list.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+ Oid pub_relid = relid;
+
+ if (pub->pubviaroot && am_partition)
+ {
+ if (pub->alltables)
+ pub_relid = llast_oid(get_partition_ancestors(relid));
+ else
+ {
+ List *ancestors = get_partition_ancestors(relid);
+ ListCell *lc2;
+
+ /*
+ * Find the "topmost" ancestor that is in this
+ * publication.
+ */
+ foreach(lc2, ancestors)
+ {
+ Oid ancestor = lfirst_oid(lc2);
+
+ if (list_member_oid(GetRelationPublications(ancestor),
+ pub->oid))
+ {
+ pub_relid = ancestor;
+ }
+ }
+ }
+ }
+
+ /*
+ * Lookup if there is a row-filter, and if so build the ExprState for it.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(pub_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstates = lappend(entry->exprstates, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+
+ } /* loop all subscribed publications */
+
+ entry->rowfilter_valid = true;
+ }
/* Bail out if there is no row filter */
- if (entry->exprstate == NIL)
+ if (entry->exprstates == NIL)
return true;
elog(DEBUG3, "table \"%s.%s\" has row filter",
- get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
- get_rel_name(relation->rd_id));
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
PushActiveSnapshot(GetTransactionSnapshot());
@@ -761,7 +870,7 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
* different row filter in these publications, all row filters must be
* matched in order to replicate this change.
*/
- foreach(lc, entry->exprstate)
+ foreach(lc, entry->exprstates)
{
ExprState *exprstate = (ExprState *) lfirst(lc);
@@ -840,7 +949,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
break;
/*
@@ -873,7 +982,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -907,7 +1016,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1321,10 +1430,11 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
- entry->exprstate = NIL;
+ entry->exprstates = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1344,7 +1454,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
List *schemaPubids = GetSchemaPublications(schemaId);
ListCell *lc;
Oid publish_as_relid = relid;
- TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1358,22 +1467,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
publications_valid = true;
}
- /* Release tuple table slot */
- if (entry->scantuple != NULL)
- {
- ExecDropSingleTupleTableSlot(entry->scantuple);
- entry->scantuple = NULL;
- }
-
- /*
- * Create a tuple table slot for row filter. TupleDesc must live as
- * long as the cache remains.
- */
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- tupdesc = CreateTupleDescCopy(tupdesc);
- entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
- MemoryContextSwitchTo(oldctx);
-
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1383,9 +1476,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
{
Publication *pub = lfirst(lc);
bool publish = false;
- HeapTuple rftuple;
- Datum rfdatum;
- bool rfisnull;
if (pub->alltables)
{
@@ -1449,33 +1539,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- /*
- * Cache row filter, if available. All publication-table mappings
- * must be checked. If it is a partition and pubviaroot is true,
- * use the row filter of the topmost partitioned table instead of
- * the row filter of its own partition.
- */
- rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
- if (HeapTupleIsValid(rftuple))
- {
- rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
-
- if (!rfisnull)
- {
- Node *rfnode;
- ExprState *exprstate;
-
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = stringToNode(TextDatumGetCString(rfdatum));
-
- /* Prepare for expression execution */
- exprstate = pgoutput_row_filter_init_expr(rfnode);
- entry->exprstate = lappend(entry->exprstate, exprstate);
- MemoryContextSwitchTo(oldctx);
- }
-
- ReleaseSysCache(rftuple);
- }
}
list_free(pubids);
@@ -1582,6 +1645,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanups. (Will be rebuilt later if needed).
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstates != NIL)
+ {
+ list_free_deep(entry->exprstates);
+ entry->exprstates = NIL;
+ }
}
}
@@ -1622,12 +1700,6 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
-
- if (entry->exprstate != NIL)
- {
- list_free_deep(entry->exprstate);
- entry->exprstate = NIL;
- }
}
MemoryContextSwitchTo(oldctx);
--
1.8.3.1
v35-0004-PS-Row-filter-validation-of-replica-identity.patch (application/octet-stream)
From 527a90a14fb56468a3a3d5879836d8c1dc4e0dde Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Tue, 2 Nov 2021 07:22:43 -0400
Subject: [PATCH v35 4/4] PS - Row filter validation of replica identity.
This patch introduces some additional row filter validation. Currently it is
implemented only for the publish mode "delete"; it validates that every column
referenced in the filter expression is part of the REPLICA IDENTITY or primary
key.
Also updated the test code.
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
---
src/backend/catalog/dependency.c | 55 ++++++++++++++++++
src/backend/catalog/pg_publication.c | 77 +++++++++++++++++++++++-
src/include/catalog/dependency.h | 2 +
src/test/regress/expected/publication.out | 97 +++++++++++++++++++++++++++++--
src/test/regress/sql/publication.sql | 79 ++++++++++++++++++++++++-
src/test/subscription/t/025_row_filter.pl | 7 +--
6 files changed, 305 insertions(+), 12 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index 9f8eb1a..de7ff90 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -1560,6 +1560,61 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Find all the columns referenced by the row-filter expression and return them
+ * as a list of attribute numbers. This list is used for row-filter validation.
+ */
+List *
+rowfilter_find_cols(Node *expr, Oid relId)
+{
+ find_expr_references_context context;
+ RangeTblEntry rte;
+ int ref;
+ List *rfcols = NIL;
+
+ context.addrs = new_object_addresses();
+
+ /* We gin up a rather bogus rangetable list to handle Vars */
+ MemSet(&rte, 0, sizeof(rte));
+ rte.type = T_RangeTblEntry;
+ rte.rtekind = RTE_RELATION;
+ rte.relid = relId;
+ rte.relkind = RELKIND_RELATION; /* no need for exactness here */
+ rte.rellockmode = AccessShareLock;
+
+ context.rtables = list_make1(list_make1(&rte));
+
+ /* Scan the expression tree for referenceable objects */
+ find_expr_references_walker(expr, &context);
+
+ /* Remove any duplicates */
+ eliminate_duplicate_dependencies(context.addrs);
+
+ /* Build/Return the list of columns referenced by this Row Filter */
+ for (ref = 0; ref < context.addrs->numrefs; ref++)
+ {
+ ObjectAddress *thisobj = context.addrs->refs + ref;
+
+ if (thisobj->classId == RelationRelationId)
+ {
+ AttrNumber attnum;
+
+ /*
+ * The parser already took care of ensuring columns must be from
+ * the correct table.
+ */
+ Assert(thisobj->objectId == relId);
+
+ attnum = thisobj->objectSubId;
+ rfcols = lappend_int(rfcols, attnum);
+ }
+ }
+
+ free_object_addresses(context.addrs);
+
+ return rfcols;
+}
+
+/*
* recordDependencyOnExpr - find expression dependencies
*
* This is used to find the dependencies of rules, constraint expressions,
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 57d08e7..e04e069 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -214,9 +214,79 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
}
/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
+ * Decide if the row-filter is valid according to the following rules:
+ *
+ * Rule 1. If the publish operation contains "delete" then only columns that
+ * are allowed by the REPLICA IDENTITY rules are permitted to be used in the
+ * row-filter WHERE clause.
+ *
+ * Rule 2. TODO
*/
+static void
+rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+{
+ Oid relid = RelationGetRelid(rel);
+ char *relname = RelationGetRelationName(rel);
+
+ /*
+ * Rule 1: For "delete", check that filter cols are also valid replica
+ * identity cols.
+ *
+ * TODO - check later for publish "update" case.
+ */
+ if (pub->pubactions.pubdelete)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ List *rfcols;
+ ListCell *lc;
+ Bitmapset *bms_okcols;
+
+ /*
+ * Find what are the cols that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+
+ /*
+ * Find what cols are referenced in the row filter WHERE clause,
+ * and validate that each of those referenced cols is allowed.
+ */
+ rfcols = rowfilter_find_cols(rfnode, relid);
+ foreach(lc, rfcols)
+ {
+ int attnum = lfirst_int(lc);
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, bms_okcols))
+ {
+ const char *colname = get_attname(relid, attnum, false);
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ relname),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+ }
+
+ bms_free(bms_okcols);
+ list_free(rfcols);
+ }
+ }
+}
+
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -315,6 +385,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 3eca295..2427321 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -152,6 +152,8 @@ extern void performDeletion(const ObjectAddress *object,
extern void performMultipleDeletions(const ObjectAddresses *objects,
DropBehavior behavior, int flags);
+extern List *rowfilter_find_cols(Node *expr, Oid relId);
+
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
DependencyType behavior);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index cd01d4d..54562c6 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -244,13 +244,15 @@ CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -260,7 +262,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -271,7 +273,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -282,7 +284,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
@@ -310,6 +312,91 @@ DROP TABLE testpub_rf_tbl4;
DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 2be5f74..82d78ad 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -139,7 +139,9 @@ CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -166,6 +168,81 @@ DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index e806b5d..dff55c2 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -223,9 +225,7 @@ $node_publisher->wait_for_catchup($appname);
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -234,7 +234,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
--
1.8.3.1
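The validation rule added by v35-0004 can be summarized in a short Python model (a sketch, not the patch's C implementation): when the publication publishes "delete", every column referenced by the row filter must belong to the table's replica identity, unless the replica identity is FULL.

```python
# Model of rowfilter_expr_checker Rule 1 from v35-0004.
def rowfilter_is_valid(filter_cols, identity_cols,
                       identity_full=False, publishes_delete=True):
    # Rule applies only when "delete" is published; FULL allows any column.
    if not publishes_delete or identity_full:
        return True
    # Otherwise the filter may reference only replica-identity columns.
    return set(filter_cols) <= set(identity_cols)

# rf_tbl_abcd_pk has PRIMARY KEY (a, b) with REPLICA IDENTITY DEFAULT:
print(rowfilter_is_valid({"a"}, {"a", "b"}))  # True  -> publication created
print(rowfilter_is_valid({"c"}, {"a", "b"}))  # False -> ERROR in the patch

# With publish = "insert" only, the check is skipped (see testpub5 above):
print(rowfilter_is_valid({"c"}, {"a", "b"}, publishes_delete=False))  # True
```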
On Tue, Nov 2, 2021 at 10:44 PM Ajin Cherian <itsajin@gmail.com> wrote:
Patches 0005 and 0006 have not yet been rebased but will be updated
in a few days.
Here's a rebase of all 6 patches. Issue remaining:
1. Changes made to AlterPublicationTables() undid changes that were made as
part of the schema publication patch. This needs to be resolved
with the correct approach.
regards,
Ajin Cherian
Fujitsu Australia
Attachments:
v36-0001-Row-filter-for-logical-replication.patch (application/octet-stream)
From 99f8a60ae27791bc1da53bafa2a313e623a29df8 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 3 Nov 2021 06:48:56 -0400
Subject: [PATCH v36 1/6] Row filter for logical replication.
This feature adds a row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If the subscriber is running a version earlier
than 15, data synchronization won't use row filters even if they are
defined on the publisher, because previous versions cannot handle them.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines whether it uses the partition's row
filter (if the parameter is false, the default) or the root partitioned
table's row filter.
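For reference, the syntax described above can be exercised as follows; the table, column, publication, and connection names are only illustrative:

```sql
-- Publisher: publish only rows matching the filter; parentheses are required.
CREATE PUBLICATION active_departments
    FOR TABLE departments WHERE (id > 2000 AND id <= 3000);

-- Subscriber (PostgreSQL 15 or later): initial synchronization copies only
-- rows that satisfy the filter; pre-15 subscribers ignore it.
CREATE SUBSCRIPTION dept_sub
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION active_departments;
```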
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 13 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 50 ++++-
src/backend/commands/publicationcmds.c | 70 ++++++-
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 50 +++--
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 257 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 4 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 102 ++++++++++
src/test/regress/sql/publication.sql | 47 +++++
src/test/subscription/t/025_row_filter.pl | 300 ++++++++++++++++++++++++++++
25 files changed, 1080 insertions(+), 50 deletions(-)
mode change 100644 => 100755 src/backend/parser/gram.y
create mode 100644 src/test/subscription/t/025_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1d11be..af6b1f6 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6299,6 +6299,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index bb4ef5e..01247d7 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -30,7 +30,7 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -52,7 +52,9 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
remove one or more tables/schemas from the publication. Note that adding
tables/schemas to a publication that is already subscribed to will require a
<literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on the
- subscribing side in order to become effective.
+ subscribing side in order to become effective. Note also that a row-filter
+ <literal>WHERE</literal> clause is not allowed in the <literal>DROP</literal>
+ clause.
</para>
<para>
@@ -109,7 +111,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
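With the grammar change above, a row filter can be supplied when adding a table, but not when dropping one; a sketch with hypothetical names:

```sql
-- Attach a row filter while adding the table (parentheses required).
ALTER PUBLICATION mypub ADD TABLE departments WHERE (id > 100);

-- A WHERE clause in the DROP form is rejected by the parser:
-- ALTER PUBLICATION mypub DROP TABLE departments WHERE (id > 100);  -- ERROR
ALTER PUBLICATION mypub DROP TABLE departments;
```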
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ca01d8c..52f6a1c 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -28,7 +28,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -76,6 +76,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -226,6 +230,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will not
+ be replicated. That's because the old row is used and it only contains the
+ primary key or columns that are part of the <literal>REPLICA IDENTITY</literal>;
+ the remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ and <command>UPDATE</command> operations, any column may be used in the
+ <literal>WHERE</literal> clause. The new row is used and it contains all
+ columns. A <literal>NULL</literal> value causes the expression to evaluate
+ to false; avoid using columns without not-null constraints in the
+ <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
+ not allow functions or user-defined operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -240,6 +259,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -253,6 +277,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..5a9430e 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -206,7 +206,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether to copy pre-existing data in the publications
that are being subscribed to when the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15, then any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
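The copy_data behavior documented above (filters from several publications must all be satisfied, and a NULL result counts as false) can be sketched in Python; row_matches and the sample filters are illustrative only, not part of the patch:

```python
def row_matches(filters, row):
    """Return True only if the row satisfies every publication's filter.

    Mirrors the documented semantics: when a table is published with
    different WHERE clauses in several publications, a row must satisfy
    all of them to be copied, and a filter evaluating to NULL (modeled
    here as None) counts as false.
    """
    for f in filters:
        result = f(row)
        if result is None or not result:
            return False
    return True

# Hypothetical filters standing in for two publications' WHERE clauses.
pub_a = lambda r: r["id"] > 2000 if r["id"] is not None else None
pub_b = lambda r: r["active"]

print(row_matches([pub_a, pub_b], {"id": 2500, "active": True}))   # True
print(row_matches([pub_a, pub_b], {"id": 1500, "active": True}))   # False
print(row_matches([pub_a, pub_b], {"id": None, "active": True}))   # NULL -> False
```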
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index fed83b8..57d08e7 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -251,22 +254,28 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -283,10 +292,30 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -300,6 +329,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -316,6 +351,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index d1fff13..413f08d 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -529,6 +529,28 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *delrels = NIL;
ListCell *oldlc;
+#if 1
+ // FIXME - can we do a better job of integrating this with the schema changes?
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
+ foreach(oldlc, oldrelids)
+ {
+ Oid oldrelid = lfirst_oid(oldlc);
+ PublicationRelInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
+ }
+#else
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
@@ -565,6 +587,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
delrels = lappend(delrels, pubrel);
}
}
+#endif
/* And drop them. */
PublicationDropTables(pubid, delrels, true);
@@ -899,22 +922,46 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ PublicationRelInfo *pub_rel;
/*
* Open, share-lock, and check all the explicitly-specified relations
*/
foreach(lc, tables)
{
- PublicationTable *t = lfirst_node(PublicationTable, lc);
- bool recurse = t->relation->inh;
+ PublicationTable *t = NULL;
+ RangeVar *rv;
+ bool recurse;
Relation rel;
Oid myrelid;
- PublicationRelInfo *pub_rel;
+ bool whereclause;
+
+ /*
+ * ALTER PUBLICATION ... ADD TABLE provides a PublicationTable List
+ * (Relation, Where clause). ALTER PUBLICATION ... DROP TABLE provides
+ * a Relation List. Check the List element to be used.
+ */
+ if (IsA(lfirst(lc), PublicationTable))
+ whereclause = true;
+ else
+ whereclause = false;
+
+ if (whereclause)
+ {
+ t = lfirst(lc);
+ rv = castNode(RangeVar, t->relation);
+ }
+ else
+ {
+ rv = lfirst_node(RangeVar, lc);
+ }
+
+ recurse = rv->inh;
/* Allow query cancel in case this takes a long time */
CHECK_FOR_INTERRUPTS();
- rel = table_openrv(t->relation, ShareUpdateExclusiveLock);
+ rel = table_openrv(rv, ShareUpdateExclusiveLock);
myrelid = RelationGetRelid(rel);
/*
@@ -931,7 +978,12 @@ OpenTableList(List *tables)
}
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = myrelid;
pub_rel->relation = rel;
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
@@ -966,7 +1018,13 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = childrelid;
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ if (whereclause)
+ pub_rel->whereClause = t->whereClause;
+ else
+ pub_rel->whereClause = NULL;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -993,6 +1051,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -1042,7 +1102,7 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(pub_rel->relid, GetUserId()))
aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
RelationGetRelationName(rel));
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 82464c9..0f4c64d 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4829,6 +4829,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index f537d3e..5776447 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2312,6 +2312,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
old mode 100644
new mode 100755
index d0eb80e..1bffe58
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -205,7 +205,7 @@ static void processCASbits(int cas_bits, int location, const char *constrType,
bool *deferrable, bool *initdeferred, bool *not_valid,
bool *no_inherit, core_yyscan_t yyscanner);
static void preprocess_pubobj_list(List *pubobjspec_list,
- core_yyscan_t yyscanner);
+ core_yyscan_t yyscanner, bool alter_drop);
static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%}
@@ -9633,7 +9633,7 @@ CreatePublicationStmt:
n->pubname = $3;
n->options = $6;
n->pubobjects = (List *)$5;
- preprocess_pubobj_list(n->pubobjects, yyscanner);
+ preprocess_pubobj_list(n->pubobjects, yyscanner, false);
$$ = (Node *)n;
}
;
@@ -9652,12 +9652,13 @@ CreatePublicationStmt:
* relation_expr here.
*/
PublicationObjSpec:
- TABLE relation_expr
+ TABLE relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_TABLE;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $2;
+ $$->pubtable->whereClause = $3;
}
| ALL TABLES IN_P SCHEMA ColId
{
@@ -9672,28 +9673,45 @@ PublicationObjSpec:
$$->pubobjtype = PUBLICATIONOBJ_CURRSCHEMA;
$$->location = @5;
}
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
$$->location = @1;
}
- | ColId indirection
+ | ColId indirection OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
+ if ($3)
+ {
+ $$->pubtable->whereClause = $3;
+ }
$$->location = @1;
}
/* grammar like tablename * , ONLY tablename, ONLY ( tablename ) */
- | extended_relation_expr
+ | extended_relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $1;
+ if ($2)
+ {
+ $$->pubtable->whereClause = $2;
+ }
}
| CURRENT_SCHEMA
{
@@ -9739,7 +9757,7 @@ AlterPublicationStmt:
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
n->pubobjects = $5;
- preprocess_pubobj_list(n->pubobjects, yyscanner);
+ preprocess_pubobj_list(n->pubobjects, yyscanner, false);
n->action = DEFELEM_ADD;
$$ = (Node *)n;
}
@@ -9748,7 +9766,7 @@ AlterPublicationStmt:
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
n->pubobjects = $5;
- preprocess_pubobj_list(n->pubobjects, yyscanner);
+ preprocess_pubobj_list(n->pubobjects, yyscanner, false);
n->action = DEFELEM_SET;
$$ = (Node *)n;
}
@@ -9757,7 +9775,7 @@ AlterPublicationStmt:
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
n->pubobjects = $5;
- preprocess_pubobj_list(n->pubobjects, yyscanner);
+ preprocess_pubobj_list(n->pubobjects, yyscanner, true);
n->action = DEFELEM_DROP;
$$ = (Node *)n;
}
@@ -17310,7 +17328,7 @@ processCASbits(int cas_bits, int location, const char *constrType,
* convert PUBLICATIONOBJ_CONTINUATION into appropriate PublicationObjSpecType.
*/
static void
-preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
+preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner, bool alter_drop)
{
ListCell *cell;
PublicationObjSpec *pubobj;
@@ -17341,7 +17359,15 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ /* cannot use a WHERE row filter with ALTER PUBLICATION ... DROP TABLE */
+ if (pubobj->pubtable && pubobj->pubtable->whereClause && alter_drop)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE"),
+ parser_errposition(pubobj->location));
+
+ if (pubobj->name)
{
/* convert it to PublicationTable */
PublicationTable *pubtable = makeNode(PublicationTable);
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 3cec8de..e946f17 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..9d86a10 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get relation qual. DISTINCT avoids the same expression of a table in
+ * multiple publications from being included multiple times in the final
+ * expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
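For a table published with two different row filters, the COPY command assembled by the code above would look roughly like this (schema, columns, and filter expressions are hypothetical):

```sql
COPY (SELECT id, name, active
        FROM public.departments
       WHERE (id > 2000 AND id <= 3000) AND (active IS TRUE)) TO STDOUT;
```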
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6f6a203..077ae18 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +148,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +157,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +639,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache ExprState using CacheMemoryContext. This mirrors ExecPrepareExpr(),
+ * which is not used here because it requires an EState.
+ * It should probably be another function in the executor to handle the
+ * execution outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false, i.e. the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated; otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +808,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +832,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +839,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * The schema should be sent before the relation is switched to its root
+ * ancestor, because maybe_send_schema() also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +872,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +906,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +975,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1116,9 +1297,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1141,6 +1323,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1160,6 +1344,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *schemaPubids = GetSchemaPublications(schemaId);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1173,6 +1358,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1182,6 +1383,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1245,9 +1449,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1365,6 +1593,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1603,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1391,7 +1622,15 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ {
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
+ }
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index b9635a9..661fdf6 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4229,6 +4229,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4239,9 +4240,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4250,6 +4258,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4290,6 +4299,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationObject(&(pubrinfo[j].dobj), fout);
@@ -4360,8 +4373,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index f9af14b..00636ca 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -630,6 +630,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 0066614..0a7f278 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6320,8 +6320,12 @@ addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
for (i = 0; i < count; i++)
{
if (!singlecol)
+ {
printfPQExpBuffer(buf, " \"%s.%s\"", PQgetvalue(res, i, 0),
PQgetvalue(res, i, 1));
+ if (!PQgetisnull(res, i, 2))
+ appendPQExpBuffer(buf, " WHERE (%s)", PQgetvalue(res, i, 2));
+ }
else
printfPQExpBuffer(buf, " \"%s\"", PQgetvalue(res, i, 0));
@@ -6450,8 +6454,15 @@ describePublications(const char *pattern)
{
/* Get the tables for the specified publication */
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 1ae439e..964c204 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,7 +85,9 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
+ Oid relid;
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -122,7 +124,7 @@ extern List *GetPubPartitionOptionRelations(List *result,
Oid relid);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 49123e2..bb24aec 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3641,6 +3641,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 0f4fe4d..8b1bcd1 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -239,6 +239,108 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+ Publication testpub_syntax1
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE ((e < 999))
+
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+ Publication testpub_syntax2
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
+
+DROP PUBLICATION testpub_syntax2;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
+LINE 1: ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE ...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP SCHEMA testpub_rf_myschema;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 85a5302..acf43c8 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -134,6 +134,53 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP SCHEMA testpub_rf_myschema;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
new file mode 100644
index 0000000..e806b5d
--- /dev/null
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -0,0 +1,300 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions must succeed. In this case, rows are replicated only if the
+# c value is divisible by both 2 AND 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false, so the row filter
+# of each partition is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so they test
+# normal logical replication behavior (not initial synchronization).
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true, so the row filter
+# of the root partitioned table is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After the TRUNCATE and REFRESH PUBLICATION, the initial data copy applies
+# the row filter of the partitioned table.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
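The filter-combination and NULL semantics this TAP test relies on can be modelled outside the server. The sketch below is illustrative Python, not PostgreSQL code; the helper names are mine, and only the behavior is taken from the test comments above: every publication's filter must evaluate to TRUE, and a NULL comparison yields unknown, which counts as not matching.

```python
# Illustrative model of the row-filter semantics exercised above (not
# PostgreSQL code): a row is replicated only when every publication's
# filter evaluates to TRUE under SQL three-valued logic.

def sql_ne(a, b):
    """SQL '<>': unknown (None) if either operand is NULL."""
    if a is None or b is None:
        return None
    return a != b

def row_passes(filters, row):
    # NULL/unknown results count as "not TRUE", so the row is filtered out.
    return all(f(row) is True for f in filters)

# tab_rowfilter_2: (c % 2 = 0) combined with (c % 3 = 0) -> divisible by 6
filters_2 = [lambda r: r["c"] % 2 == 0,
             lambda r: r["c"] % 3 == 0]
replicated = [c for c in range(1, 21) if row_passes(filters_2, {"c": c})]
# replicated == [6, 12, 18], matching the count/min/max check above

# tab_rowfilter_1: (a > 1000 AND b <> 'filtered')
filters_1 = [lambda r: r["a"] > 1000 and sql_ne(r["b"], "filtered") is True]
# UPDATE (1600, NULL): NULL <> 'filtered' is unknown, so the row is filtered
assert row_passes(filters_1, {"a": 1600, "b": None}) is False
```

This mirrors why the (1600, NULL) update above is not replicated even though a > 1000 holds.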
v36-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patch (application/octet-stream)
From d5846417e9a407816fec34f2460df0d4557daf50 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 3 Nov 2021 06:52:44 -0400
Subject: [PATCH v36 2/6] PS - Add tab auto-complete support for the Row Filter
WHERE.
e.g.
"CREATE PUBLICATION <name> FOR TABLE <name>" - complete with "WHERE (".
"ALTER PUBLICATION <name> ADD|SET TABLE <name>" - complete with "WHERE (".
---
src/bin/psql/tab-complete.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 8e01f54..c7c765b 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,14 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (");
+ /* "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with table attributes */
+ /* "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2708,10 +2716,13 @@ psql_completion(const char *text, int start, int end)
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
+ /* "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
/*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
--
1.8.3.1
v36-0005-PS-Row-filter-validation-walker.patch (application/octet-stream)
From 9b5b53181a278d79eef701884a586906b5f4dd65 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 3 Nov 2021 08:11:57 -0400
Subject: [PATCH v36 5/6] PS - Row filter validation walker.
This patch implements a parse-tree "walker" to validate a row-filter expression.
Only very simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no user-defined functions.
- no system functions (unless they are IMMUTABLE). See design decision at [1].
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr
[1] https://www.postgresql.org/message-id/CAA4eK1%2BXoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ%40mail.gmail.com
This patch also disables (#if 0) all other row-filter errors which were thrown for
EXPR_KIND_PUBLICATION_WHERE in the 0001 patch.
Some regression tests are updated due to modified validation messages and rules.
---
src/backend/catalog/dependency.c | 93 +++++++++++++++++++++++++++++++
src/backend/catalog/pg_publication.c | 20 ++++---
src/backend/parser/parse_agg.c | 5 +-
src/backend/parser/parse_expr.c | 6 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 2 +
src/include/catalog/dependency.h | 2 +-
src/test/regress/expected/publication.out | 25 ++++++---
src/test/regress/sql/publication.sql | 11 +++-
9 files changed, 149 insertions(+), 18 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index de7ff90..fbf8d78 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -133,6 +133,12 @@ typedef struct
int subflags; /* flags to pass down when recursing to obj */
} ObjectAddressAndFlags;
+/* for rowfilter_walker */
+typedef struct
+{
+ char *relname;
+} rf_context;
+
/* for find_expr_references_walker */
typedef struct
{
@@ -1560,6 +1566,93 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Walker checks that the row filter expression is legal. Allow only simple
+ * or compound expressions like:
+ *
+ * "(Var Op Const)" or
+ * "(Var Op Const) Bool (Var Op Const)"
+ *
+ * User-defined operators are not allowed.
+ * User-defined functions are not allowed.
+ * System functions that are not IMMUTABLE are not allowed.
+ */
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+ bool too_complex = false;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var) || IsA(node, Const) || IsA(node, BoolExpr))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ Oid funcid = ((FuncExpr *)node)->funcid;
+ char *funcname = get_func_name(funcid);
+
+ /*
+ * User-defined functions are not allowed.
+ * System-functions that are not IMMUTABLE are not allowed.
+ */
+ if (funcid >= FirstNormalObjectId)
+ {
+ forbidden = psprintf("user-defined functions are not allowed: %s",
+ funcname);
+ }
+ else
+ {
+ if (func_volatile(funcid) != PROVOLATILE_IMMUTABLE)
+ forbidden = psprintf("system functions that are not IMMUTABLE are not allowed: %s",
+ funcname);
+ }
+ }
+ else
+ {
+ elog(DEBUG1, "the row filter contained something unexpected: %s", nodeToString(node));
+ too_complex = true;
+ }
+
+ if (too_complex)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ context->relname),
+ errhint("only simple expressions using columns, constants and immutable system functions are allowed")
+ ));
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ context->relname),
+ errdetail("%s", forbidden)
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
+/*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if it encounters anything not permitted or unexpected.
+ */
+void
+rowfilter_validator(char *relname, Node *expr)
+{
+ rf_context context = {0};
+
+ context.relname = relname;
+ rowfilter_walker(expr, &context);
+}
+
+/*
* Find all the columns referenced by the row-filter expression and return them
* as a list of attribute numbers. This list is used for row-filter validation.
*/
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index e04e069..f56e01f 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -216,20 +216,26 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
/*
* Decide if the row-filter is valid according to the following rules:
*
- * Rule 1. If the publish operation contains "delete" then only columns that
+ * Rule 1. Walk the parse-tree and reject anything other than very simple
+ * expressions. (See rowfilter_validator for details of what is permitted.)
+ *
+ * Rule 2. If the publish operation contains "delete" then only columns that
* are allowed by the REPLICA IDENTITY rules are permitted to be used in the
* row-filter WHERE clause.
- *
- * Rule 2. TODO
*/
static void
-rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node *rfnode, Relation rel)
{
Oid relid = RelationGetRelid(rel);
char *relname = RelationGetRelationName(rel);
/*
- * Rule 1: For "delete", check that filter cols are also valid replica
+ * Rule 1. Walk the parse-tree and reject anything unexpected.
+ */
+ rowfilter_validator(relname, rfnode);
+
+ /*
+ * Rule 2: For "delete", check that filter cols are also valid replica
* identity cols.
*
* TODO - check later for publish "update" case.
@@ -381,13 +387,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
whereclause = transformWhereClause(pstate,
copyObject(pri->whereClause),
EXPR_KIND_PUBLICATION_WHERE,
- "PUBLICATION");
+ "PUBLICATION WHERE");
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
/* Validate the row-filter. */
- rowfilter_expr_checker(pub, whereclause, targetrel);
+ rowfilter_expr_checker(pub, pstate, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..5e0c391 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,12 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
if (isAgg)
err = _("aggregate functions are not allowed in publication WHERE expressions");
else
err = _("grouping operations are not allowed in publication WHERE expressions");
-
+#endif
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +952,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("window functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..3519e62 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -201,6 +201,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
case T_FuncCall:
{
+#if 0
/*
* Forbid functions in publication WHERE condition
*/
@@ -209,6 +210,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("functions are not allowed in publication WHERE expressions"),
parser_errposition(pstate, exprLocation(expr))));
+#endif
result = transformFuncCall(pstate, (FuncCall *) expr);
break;
@@ -1777,7 +1779,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("cannot use subquery in publication WHERE expression");
+#endif
break;
/*
@@ -3100,7 +3104,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index e946f17..de9600f 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,10 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+ pstate->p_hasTargetSRFs = true;
+#if 0
err = _("set-returning functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..b3588df 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,12 +718,14 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+#if 0
/* Check it's not a custom operator for publication WHERE expressions */
if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("user-defined operators are not allowed in publication WHERE expressions"),
parser_errposition(pstate, location)));
+#endif
/* Do typecasting and build the expression tree */
if (ltree == NULL)
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 2427321..e118256 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -16,7 +16,6 @@
#include "catalog/objectaddress.h"
-
/*
* Precise semantics of a dependency relationship are specified by the
* DependencyType code (which is stored in a "char" field in pg_depend,
@@ -153,6 +152,7 @@ extern void performMultipleDeletions(const ObjectAddresses *objects,
DropBehavior behavior, int flags);
extern List *rowfilter_find_cols(Node *expr, Oid relId);
+extern void rowfilter_validator(char *relname, Node *expr);
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 72f8dff..79b1955 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -308,17 +308,27 @@ RESET client_min_messages;
DROP PUBLICATION testpub_syntax2;
ERROR: publication "testpub_syntax2" does not exist
-- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: user-defined functions are not allowed: testpub_rf_func99
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: system functions that are not IMMUTABLE are not allowed: random
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+HINT: only simple expressions using columns, constants and immutable system functions are allowed
-- fail - WHERE not allowed in DROP
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
@@ -333,6 +343,7 @@ DROP SCHEMA testpub_rf_myschema;
DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
-- ======================================================
-- More row filter tests for validating column references
CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 2f291eb..6faa1de 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -165,11 +165,19 @@ RESET client_min_messages;
\dRp+ testpub_syntax2
DROP PUBLICATION testpub_syntax2;
-- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
-- fail - WHERE not allowed in DROP
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
@@ -182,6 +190,7 @@ DROP SCHEMA testpub_rf_myschema;
DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
-- ======================================================
-- More row filter tests for validating column references
--
1.8.3.1
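The allow-list walk that rowfilter_walker performs can be sketched as follows. This is a hypothetical Python model, not the C implementation: the node classes, the `immutable` flag, and the OID threshold constant are stand-ins for the real parse-node types, func_volatile() and FirstNormalObjectId.

```python
# Sketch of an allow-list expression-tree walk in the style of
# rowfilter_walker: visit every node and reject anything that is not a
# Var, Const, BoolExpr, built-in OpExpr, or immutable built-in FuncExpr.
# All classes and OIDs here are illustrative stand-ins.

FIRST_NORMAL_OBJECT_ID = 16384  # OIDs at/above this are user-defined

class Node: ...
class Var(Node): pass
class Const(Node): pass
class BoolExpr(Node):
    def __init__(self, args): self.args = args
class OpExpr(Node):
    def __init__(self, opno, args): self.opno, self.args = opno, args
class FuncExpr(Node):
    def __init__(self, funcid, immutable, args):
        self.funcid, self.immutable, self.args = funcid, immutable, args

def validate(node):
    if node is None:
        return
    if isinstance(node, (Var, Const)):
        return
    if isinstance(node, OpExpr):
        if node.opno >= FIRST_NORMAL_OBJECT_ID:
            raise ValueError("user-defined operators are not allowed")
    elif isinstance(node, FuncExpr):
        if node.funcid >= FIRST_NORMAL_OBJECT_ID:
            raise ValueError("user-defined functions are not allowed")
        if not node.immutable:
            raise ValueError("non-IMMUTABLE system functions are not allowed")
    elif not isinstance(node, BoolExpr):
        # anything else (sublinks, aggregates, ...) is "too complex"
        raise ValueError("row filter expression is too complex")
    for child in getattr(node, "args", []):
        validate(child)

# "(a > 1000) AND (b <> 'filtered')" with built-in operators: accepted
validate(BoolExpr([OpExpr(521, [Var(), Const()]),
                   OpExpr(531, [Var(), Const()])]))
```

As in the patch, recursion continues into children only after the current node passes, so a forbidden node anywhere in the tree raises the corresponding error.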
v36-0003-PS-ExprState-cache-modifications.patch (application/octet-stream)
From bbea43071179d26b9b0d069f942c73cf47673772 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 3 Nov 2021 06:57:08 -0400
Subject: [PATCH v36 3/6] PS - ExprState cache modifications.
Now the row-filter caches (e.g. the ExprState list) are invalidated only in the
rel_sync_cache_relation_cb function, so an ALTER PUBLICATION for one table no
longer causes the row filters of other tables to become invalidated as well.
Also, all code related to caching row filters has been removed from the
get_rel_sync_entry function; it now runs just before the filters are needed, in
the pgoutput_row_filter function.
Changes are based on suggestions from Amit [1] [2].
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
---
src/backend/replication/pgoutput/pgoutput.c | 200 +++++++++++++++++++---------
1 file changed, 136 insertions(+), 64 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 077ae18..3dfac7d 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -123,7 +123,15 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
- List *exprstate; /* ExprState for row filter */
+
+ /*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' means only that the exprstates list is
+ * correct; it doesn't mean that there actually is any row filter
+ * present for the current relid.
+ */
+ bool rowfilter_valid;
+ List *exprstates; /* ExprState for row filter(s) */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
@@ -161,7 +169,7 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static EState *create_estate_for_relation(Relation rel);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
/*
@@ -731,20 +739,121 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
* If it returns true, the change is replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
{
EState *estate;
ExprContext *ecxt;
ListCell *lc;
bool result = true;
+ Oid relid = RelationGetRelid(relation);
+
+ /*
+ * If the row filter cache is currently flagged "invalid" then we don't
+ * yet know whether there are any row filters for this relation.
+ *
+ * This code usually executes only once per relation.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ bool am_partition = get_rel_relispartition(relid);
+ MemoryContext oldctx;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains. Release the tuple table slot if it
+ * already exists.
+ */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState(s) and cache them in the
+ * entry->exprstates list.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+ Oid pub_relid = relid;
+
+ if (pub->pubviaroot && am_partition)
+ {
+ if (pub->alltables)
+ pub_relid = llast_oid(get_partition_ancestors(relid));
+ else
+ {
+ List *ancestors = get_partition_ancestors(relid);
+ ListCell *lc2;
+
+ /*
+ * Find the "topmost" ancestor that is in this
+ * publication.
+ */
+ foreach(lc2, ancestors)
+ {
+ Oid ancestor = lfirst_oid(lc2);
+
+ if (list_member_oid(GetRelationPublications(ancestor),
+ pub->oid))
+ {
+ pub_relid = ancestor;
+ }
+ }
+ }
+ }
+
+ /*
+ * Look up whether there is a row filter and, if so, build the ExprState for it.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(pub_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstates = lappend(entry->exprstates, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+
+ } /* loop all subscribed publications */
+
+ entry->rowfilter_valid = true;
+ }
/* Bail out if there is no row filter */
- if (entry->exprstate == NIL)
+ if (entry->exprstates == NIL)
return true;
elog(DEBUG3, "table \"%s.%s\" has row filter",
- get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
- get_rel_name(relation->rd_id));
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
PushActiveSnapshot(GetTransactionSnapshot());
@@ -761,7 +870,7 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
* different row filter in these publications, all row filters must be
* matched in order to replicate this change.
*/
- foreach(lc, entry->exprstate)
+ foreach(lc, entry->exprstates)
{
ExprState *exprstate = (ExprState *) lfirst(lc);
@@ -840,7 +949,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
break;
/*
@@ -873,7 +982,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -907,7 +1016,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1321,10 +1430,11 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
- entry->exprstate = NIL;
+ entry->exprstates = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1344,7 +1454,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
List *schemaPubids = GetSchemaPublications(schemaId);
ListCell *lc;
Oid publish_as_relid = relid;
- TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1358,22 +1467,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
publications_valid = true;
}
- /* Release tuple table slot */
- if (entry->scantuple != NULL)
- {
- ExecDropSingleTupleTableSlot(entry->scantuple);
- entry->scantuple = NULL;
- }
-
- /*
- * Create a tuple table slot for row filter. TupleDesc must live as
- * long as the cache remains.
- */
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- tupdesc = CreateTupleDescCopy(tupdesc);
- entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
- MemoryContextSwitchTo(oldctx);
-
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1383,9 +1476,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
{
Publication *pub = lfirst(lc);
bool publish = false;
- HeapTuple rftuple;
- Datum rfdatum;
- bool rfisnull;
if (pub->alltables)
{
@@ -1449,33 +1539,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- /*
- * Cache row filter, if available. All publication-table mappings
- * must be checked. If it is a partition and pubviaroot is true,
- * use the row filter of the topmost partitioned table instead of
- * the row filter of its own partition.
- */
- rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
- if (HeapTupleIsValid(rftuple))
- {
- rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
-
- if (!rfisnull)
- {
- Node *rfnode;
- ExprState *exprstate;
-
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = stringToNode(TextDatumGetCString(rfdatum));
-
- /* Prepare for expression execution */
- exprstate = pgoutput_row_filter_init_expr(rfnode);
- entry->exprstate = lappend(entry->exprstate, exprstate);
- MemoryContextSwitchTo(oldctx);
- }
-
- ReleaseSysCache(rftuple);
- }
}
list_free(pubids);
@@ -1582,6 +1645,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanup. (It will be rebuilt later if needed.)
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstates != NIL)
+ {
+ list_free_deep(entry->exprstates);
+ entry->exprstates = NIL;
+ }
}
}
@@ -1622,12 +1700,6 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
-
- if (entry->exprstate != NIL)
- {
- list_free_deep(entry->exprstate);
- entry->exprstate = NIL;
- }
}
MemoryContextSwitchTo(oldctx);
--
1.8.3.1
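The pgoutput changes above introduce a rowfilter_valid flag so the compiled filter expressions are built lazily in get_rel_sync_entry and dropped in the relcache invalidation callback. A minimal sketch of that lazy-build/invalidate pattern (Python, with illustrative names standing in for RelationSyncEntry and the syscache lookups):

```python
class RelSyncEntry:
    """Per-relation cache entry, loosely mirroring RelationSyncEntry."""
    def __init__(self):
        self.rowfilter_valid = False   # entry->rowfilter_valid
        self.exprstates = []           # entry->exprstates (compiled row filters)

def get_filters(entry, load_filters):
    """Build the filter cache on first use; reuse it until invalidated."""
    if not entry.rowfilter_valid:
        entry.exprstates = load_filters()  # stands in for syscache + expr init
        entry.rowfilter_valid = True
    return entry.exprstates

def invalidate(entry):
    """Relcache callback: drop cached state; it is rebuilt later if needed."""
    entry.rowfilter_valid = False
    entry.exprstates = []
```

The same flag lets rel_sync_cache_relation_cb free the slot and expression list cheaply without rebuilding anything until the next change actually needs the filter.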
Attachment: v36-0004-PS-Row-filter-validation-of-replica-identity.patch (application/octet-stream)
From 8a025a3b61252283e2df478a0a2a99f4b7b4958d Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 3 Nov 2021 07:50:24 -0400
Subject: [PATCH v36 4/6] PS - Row filter validation of replica identity.
This patch introduces some additional row filter validation. Currently it is
implemented only for the publish mode "delete" and it validates that any columns
referenced in the filter expression must be part of REPLICA IDENTITY or Primary
Key.
Also updated test code.
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
---
src/backend/catalog/dependency.c | 55 ++++++++++++++
src/backend/catalog/pg_publication.c | 77 ++++++++++++++++++-
src/include/catalog/dependency.h | 2 +
src/test/regress/expected/publication.out | 119 ++++++++++++++++++++++++------
src/test/regress/sql/publication.sql | 79 +++++++++++++++++++-
src/test/subscription/t/025_row_filter.pl | 7 +-
6 files changed, 311 insertions(+), 28 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index 9f8eb1a..de7ff90 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -1560,6 +1560,61 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Find all the columns referenced by the row-filter expression and return them
+ * as a list of attribute numbers. This list is used for row-filter validation.
+ */
+List *
+rowfilter_find_cols(Node *expr, Oid relId)
+{
+ find_expr_references_context context;
+ RangeTblEntry rte;
+ int ref;
+ List *rfcols = NIL;
+
+ context.addrs = new_object_addresses();
+
+ /* We gin up a rather bogus rangetable list to handle Vars */
+ MemSet(&rte, 0, sizeof(rte));
+ rte.type = T_RangeTblEntry;
+ rte.rtekind = RTE_RELATION;
+ rte.relid = relId;
+ rte.relkind = RELKIND_RELATION; /* no need for exactness here */
+ rte.rellockmode = AccessShareLock;
+
+ context.rtables = list_make1(list_make1(&rte));
+
+ /* Scan the expression tree for referenceable objects */
+ find_expr_references_walker(expr, &context);
+
+ /* Remove any duplicates */
+ eliminate_duplicate_dependencies(context.addrs);
+
+ /* Build/Return the list of columns referenced by this Row Filter */
+ for (ref = 0; ref < context.addrs->numrefs; ref++)
+ {
+ ObjectAddress *thisobj = context.addrs->refs + ref;
+
+ if (thisobj->classId == RelationRelationId)
+ {
+ AttrNumber attnum;
+
+ /*
+ * The parser already took care of ensuring columns must be from
+ * the correct table.
+ */
+ Assert(thisobj->objectId == relId);
+
+ attnum = thisobj->objectSubId;
+ rfcols = lappend_int(rfcols, attnum);
+ }
+ }
+
+ free_object_addresses(context.addrs);
+
+ return rfcols;
+}
+
+/*
* recordDependencyOnExpr - find expression dependencies
*
* This is used to find the dependencies of rules, constraint expressions,
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 57d08e7..e04e069 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -214,9 +214,79 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
}
/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
+ * Decide if the row-filter is valid according to the following rules:
+ *
+ * Rule 1. If the publish operation contains "delete" then only columns that
+ * are allowed by the REPLICA IDENTITY rules are permitted to be used in the
+ * row-filter WHERE clause.
+ *
+ * Rule 2. TODO
*/
+static void
+rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+{
+ Oid relid = RelationGetRelid(rel);
+ char *relname = RelationGetRelationName(rel);
+
+ /*
+ * Rule 1: For "delete", check that filter cols are also valid replica
+ * identity cols.
+ *
+ * TODO - check later for publish "update" case.
+ */
+ if (pub->pubactions.pubdelete)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ List *rfcols;
+ ListCell *lc;
+ Bitmapset *bms_okcols;
+
+ /*
+ * Find the columns that are part of the REPLICA IDENTITY. Note
+ * that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+
+ /*
+ * Find what cols are referenced in the row filter WHERE clause,
+ * and validate that each of those referenced cols is allowed.
+ */
+ rfcols = rowfilter_find_cols(rfnode, relid);
+ foreach(lc, rfcols)
+ {
+ int attnum = lfirst_int(lc);
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, bms_okcols))
+ {
+ const char *colname = get_attname(relid, attnum, false);
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ relname),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+ }
+
+ bms_free(bms_okcols);
+ list_free(rfcols);
+ }
+ }
+}
+
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -315,6 +385,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 3eca295..2427321 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -152,6 +152,8 @@ extern void performDeletion(const ObjectAddress *object,
extern void performMultipleDeletions(const ObjectAddresses *objects,
DropBehavior behavior, int flags);
+extern List *rowfilter_find_cols(Node *expr, Oid relId);
+
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
DependencyType behavior);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 8b1bcd1..72f8dff 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -246,13 +246,15 @@ CREATE TABLE testpub_rf_tbl4 (g text);
CREATE SCHEMA testpub_rf_myschema;
CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -262,7 +264,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -273,7 +275,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -284,37 +286,27 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+ERROR: cannot add relation "testpub_rf_tbl3" to publication
+DETAIL: Row filter column "e" is not part of the REPLICA IDENTITY
RESET client_min_messages;
\dRp+ testpub_syntax1
- Publication testpub_syntax1
- Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
---------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
-Tables:
- "public.testpub_rf_tbl1"
- "public.testpub_rf_tbl3" WHERE ((e < 999))
-
DROP PUBLICATION testpub_syntax1;
+ERROR: publication "testpub_syntax1" does not exist
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+ERROR: cannot add relation "testpub_rf_tbl5" to publication
+DETAIL: Row filter column "h" is not part of the REPLICA IDENTITY
RESET client_min_messages;
\dRp+ testpub_syntax2
- Publication testpub_syntax2
- Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
---------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
-Tables:
- "public.testpub_rf_tbl1"
- "testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
-
DROP PUBLICATION testpub_syntax2;
+ERROR: publication "testpub_syntax2" does not exist
-- fail - functions disallowed
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
ERROR: functions are not allowed in publication WHERE expressions
@@ -341,6 +333,91 @@ DROP SCHEMA testpub_rf_myschema;
DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index acf43c8..2f291eb 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -141,7 +141,9 @@ CREATE TABLE testpub_rf_tbl4 (g text);
CREATE SCHEMA testpub_rf_myschema;
CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -181,6 +183,81 @@ DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index e806b5d..dff55c2 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -223,9 +225,7 @@ $node_publisher->wait_for_catchup($appname);
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -234,7 +234,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
--
1.8.3.1
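The validation in rowfilter_expr_checker reduces to a small decision procedure: when the publish mode includes "delete", every column referenced by the filter must be covered by the replica identity (FULL allows all columns, DEFAULT means the primary key, USING INDEX means the identity index's columns, NOTHING allows none). A hedged Python sketch of that rule, with plain column-name sets standing in for the bitmapsets the C code uses:

```python
def rowfilter_allowed(replident, filter_cols,
                      pk_cols=frozenset(), index_cols=frozenset()):
    """Return (ok, offending_column) following the patch's Rule 1 for pubdelete."""
    if replident == "full":
        return True, None            # all columns are part of the identity
    if replident == "default":
        okcols = pk_cols             # DEFAULT means primary key or nothing
    elif replident == "index":
        okcols = index_cols          # columns of the identity index
    else:                            # "nothing"
        okcols = frozenset()
    for col in filter_cols:
        if col not in okcols:
            return False, col        # C code raises ERRCODE_INVALID_COLUMN_REFERENCE
    return True, None
```

This matches the regression-test matrix above: for rf_tbl_abcd_pk with PRIMARY KEY(a,b), filters on "a" or "b" pass under DEFAULT while "c" and "d" fail, and switching to REPLICA IDENTITY FULL makes any column acceptable.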
Attachment: v36-0006-Support-updates-based-on-old-and-new-tuple-in-ro.patch (application/octet-stream)
From 5a6e2e2fb3f84938dc73bf0647c7e22dbc8c9600 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 3 Nov 2021 08:32:21 -0400
Subject: [PATCH v36 6/6] Support updates based on old and new tuple in row
filters.
When applying row filter on updates, check both old_tuple and new_tuple
to decide how an update needs to be transformed.
UPDATE
old-row (match) new-row (no match) -> DELETE
old-row (no match) new row (match) -> INSERT
old-row (match) new row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Deformed tuples are also cached in slots to avoid deforming the
same tuple multiple times.
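The transformation table above reduces to a two-bit decision on whether the old and new rows match the filter. A minimal sketch (Python; the string action names are illustrative stand-ins for the ReorderBufferChangeType values the patch returns through the action parameter):

```python
def transform_update(old_matches, new_matches):
    """Map (old-row match, new-row match) to the replicated action, per the table above."""
    if old_matches and not new_matches:
        return "DELETE"   # row leaves the published set
    if not old_matches and new_matches:
        return "INSERT"   # row enters the published set
    if old_matches and new_matches:
        return "UPDATE"   # row stays in the published set
    return None           # neither row matches: drop the change
```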
---
src/backend/replication/logical/proto.c | 122 +++++++++++++
src/backend/replication/pgoutput/pgoutput.c | 263 ++++++++++++++++++++++++++--
src/include/replication/logicalproto.h | 4 +
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/025_row_filter.pl | 4 +-
5 files changed, 377 insertions(+), 22 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..6b14340 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -32,6 +33,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
HeapTuple tuple, bool binary);
+static void logicalrep_write_tuple_cached(StringInfo out, Relation rel,
+ TupleTableSlot *slot, bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -438,6 +441,38 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
}
/*
+ * Write UPDATE to the output stream using cached virtual slots.
+ * Cached updates will have both old tuple and new tuple.
+ */
+void
+logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple, bool binary)
+{
+ pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
+
+ Assert(rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT ||
+ rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
+ rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX);
+
+ /* transaction ID (if not valid, we're not streaming) */
+ if (TransactionIdIsValid(xid))
+ pq_sendint32(out, xid);
+
+ /* use Oid as relation identifier */
+ pq_sendint32(out, RelationGetRelid(rel));
+
+ if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ pq_sendbyte(out, 'O'); /* old tuple follows */
+ else
+ pq_sendbyte(out, 'K'); /* old key follows */
+ logicalrep_write_tuple_cached(out, rel, oldtuple, binary);
+
+ pq_sendbyte(out, 'N'); /* new tuple follows */
+ logicalrep_write_tuple_cached(out, rel, newtuple, binary);
+}
+
+
+/*
* Write UPDATE to the output stream.
*/
void
@@ -746,6 +781,93 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
}
/*
+ * Write a tuple to the outputstream using cached slot, in the most efficient format possible.
+ */
+static void
+logicalrep_write_tuple_cached(StringInfo out, Relation rel, TupleTableSlot *slot, bool binary)
+{
+ TupleDesc desc;
+ int i;
+ uint16 nliveatts = 0;
+ HeapTuple tuple = ExecFetchSlotHeapTuple(slot, false, NULL);
+
+ desc = RelationGetDescr(rel);
+
+ for (i = 0; i < desc->natts; i++)
+ {
+ if (TupleDescAttr(desc, i)->attisdropped || TupleDescAttr(desc, i)->attgenerated)
+ continue;
+ nliveatts++;
+ }
+ pq_sendint16(out, nliveatts);
+
+ /* try to allocate enough memory from the get-go */
+ enlargeStringInfo(out, tuple->t_len +
+ nliveatts * (1 + 4));
+
+ /* Write the values */
+ for (i = 0; i < desc->natts; i++)
+ {
+ HeapTuple typtup;
+ Form_pg_type typclass;
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ if (att->attisdropped || att->attgenerated)
+ continue;
+
+ if (slot->tts_isnull[i])
+ {
+ pq_sendbyte(out, LOGICALREP_COLUMN_NULL);
+ continue;
+ }
+
+ if (att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(slot->tts_values[i]))
+ {
+ /*
+ * Unchanged toasted datum. (Note that we don't promise to detect
+ * unchanged data in general; this is just a cheap check to avoid
+ * sending large values unnecessarily.)
+ */
+ pq_sendbyte(out, LOGICALREP_COLUMN_UNCHANGED);
+ continue;
+ }
+
+ typtup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(att->atttypid));
+ if (!HeapTupleIsValid(typtup))
+ elog(ERROR, "cache lookup failed for type %u", att->atttypid);
+ typclass = (Form_pg_type) GETSTRUCT(typtup);
+
+ /*
+ * Send in binary if requested and type has suitable send function.
+ */
+ if (binary && OidIsValid(typclass->typsend))
+ {
+ bytea *outputbytes;
+ int len;
+
+ pq_sendbyte(out, LOGICALREP_COLUMN_BINARY);
+ outputbytes = OidSendFunctionCall(typclass->typsend, slot->tts_values[i]);
+ len = VARSIZE(outputbytes) - VARHDRSZ;
+ pq_sendint(out, len, 4); /* length */
+ pq_sendbytes(out, VARDATA(outputbytes), len); /* data */
+ pfree(outputbytes);
+ }
+ else
+ {
+ char *outputstr;
+
+ pq_sendbyte(out, LOGICALREP_COLUMN_TEXT);
+ outputstr = OidOutputFunctionCall(typclass->typoutput, slot->tts_values[i]);
+ pq_sendcountedtext(out, outputstr, strlen(outputstr), false);
+ pfree(outputstr);
+ }
+
+ ReleaseSysCache(typtup);
+ }
+}
+
+
+/*
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 3dfac7d..af0b9a8 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -132,7 +132,10 @@ typedef struct RelationSyncEntry
*/
bool rowfilter_valid;
List *exprstates; /* ExprState for row filter(s) */
- TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
+ TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
+ TupleTableSlot *tmp_new_tuple; /* slot for temporary new tuple used for expression evaluation */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -167,10 +170,16 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot,
+ RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry,
+ ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -734,18 +743,103 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * Update is checked against the row filter, if any.
+ * Updates are transformed to inserts and deletes based on the
+ * old_tuple and new_tuple. The new action is updated in the
+ * action parameter. If not updated, action remains as update.
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (no match) new row (match) -> INSERT
+ * old-row (match) new row (match) -> UPDATE
+ * old-row (no match) new-row (no match) -> (drop change)
+ * If it returns true, the change is to be replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter_update(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
{
- EState *estate;
- ExprContext *ecxt;
+ TupleDesc desc = entry->scantuple->tts_tupleDescriptor;
+ int i;
+ bool old_matched, new_matched;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstates == NIL)
+ return true;
+
+ /* updates require a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /*
+ * If no old_tuple, then none of the replica identity columns changed
+ * and this would reduce to a simple update.
+ */
+ if (!oldtuple)
+ return pgoutput_row_filter(relation, NULL, newtuple, entry);
+
+
+ old_slot = entry->old_tuple;
+ new_slot = entry->new_tuple;
+ tmp_new_slot = entry->tmp_new_tuple;
+ ExecClearTuple(old_slot);
+ ExecClearTuple(new_slot);
+ ExecClearTuple(tmp_new_slot);
+
+ /*
+ * Unchanged toasted replica identity columns are
+ * only detoasted in the old tuple, copy this over to the newtuple.
+ */
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, new_slot);
+
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ }
+
+ }
+
+ old_matched = pgoutput_row_filter_virtual(relation, old_slot, entry);
+ new_matched = pgoutput_row_filter_virtual(relation, tmp_new_slot, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && !old_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter, the first time.
+ */
+
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
+{
+ Oid relid = RelationGetRelid(relation);
ListCell *lc;
- bool result = true;
- Oid relid = RelationGetRelid(relation);
/*
* If the row filter caching is currently flagged "invalid" then it means we
@@ -760,7 +854,7 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
TupleDesc tupdesc = RelationGetDescr(relation);
/*
- * Create a tuple table slot for row filter. TupleDesc must live as
+ * Create tuple table slots for row filter. TupleDesc must live as
* long as the cache remains. Release the tuple table slot if it
* already exists.
*/
@@ -769,9 +863,31 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
+ if (entry->old_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->old_tuple);
+ entry->old_tuple = NULL;
+ }
+
+ if (entry->new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->new_tuple);
+ entry->new_tuple = NULL;
+ }
+
+ if (entry->tmp_new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->tmp_new_tuple);
+ entry->tmp_new_tuple = NULL;
+ }
+
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+
MemoryContextSwitchTo(oldctx);
/*
@@ -846,6 +962,76 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
entry->rowfilter_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ ListCell *lc;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstates == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = slot;
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, entry->exprstates)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ ListCell *lc;
/* Bail out if there is no row filter */
if (entry->exprstates == NIL)
@@ -941,6 +1127,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -949,7 +1138,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
break;
/*
@@ -980,9 +1169,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
- /* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter_update(relation, oldtuple, newtuple, relentry,
+ &modified_action))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1005,8 +1195,29 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ if (relentry->new_tuple != NULL && !TTS_EMPTY(relentry->new_tuple))
+ logicalrep_write_update_cached(ctx->out, xid, relation,
+ relentry->old_tuple, relentry->new_tuple, data->binary);
+ else
+ logicalrep_write_update(ctx->out, xid, relation, oldtuple,
+ newtuple, data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1016,7 +1227,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1434,6 +1645,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
+ entry->new_tuple = NULL;
+ entry->old_tuple = NULL;
+ entry->tmp_new_tuple = NULL;
entry->exprstates = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
@@ -1655,6 +1869,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
+ if (entry->new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->new_tuple);
+ entry->new_tuple = NULL;
+ }
+ if (entry->old_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->old_tuple);
+ entry->old_tuple = NULL;
+ }
+ if (entry->tmp_new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->tmp_new_tuple);
+ entry->tmp_new_tuple = NULL;
+ }
if (entry->exprstates != NIL)
{
list_free_deep(entry->exprstates);
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..ba71f3f 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -212,6 +213,9 @@ extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
HeapTuple newtuple, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index dff55c2..3fc503f 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -220,7 +220,8 @@ $node_publisher->wait_for_catchup($appname);
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+# but row deleted after the update below.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -232,7 +233,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
--
1.8.3.1
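The old-row/new-row transformation table in the patch comment above (match/no-match combinations mapping an UPDATE to an INSERT, DELETE, UPDATE, or a dropped change) can be sketched as a small standalone decision function. The names below are hypothetical illustrations, not the actual pgoutput code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical action codes, standing in for ReorderBufferChangeType. */
typedef enum
{
    ACTION_INSERT,
    ACTION_UPDATE,
    ACTION_DELETE,
    ACTION_DROP                 /* change is filtered out entirely */
} FilterAction;

/*
 * Decide how to replicate an UPDATE given whether the old and new row
 * versions match the row filter:
 *   old-row (match)    new-row (no match) -> DELETE
 *   old-row (no match) new-row (match)    -> INSERT
 *   old-row (match)    new-row (match)    -> UPDATE
 *   old-row (no match) new-row (no match) -> drop change
 */
static FilterAction
transform_update(bool old_matched, bool new_matched)
{
    if (old_matched && new_matched)
        return ACTION_UPDATE;
    if (old_matched)
        return ACTION_DELETE;
    if (new_matched)
        return ACTION_INSERT;
    return ACTION_DROP;
}
```

This mirrors the four-way branch at the end of pgoutput_row_filter_update in the patch, where the caller then dispatches to logicalrep_write_insert/update/delete accordingly.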
On Tue, Nov 2, 2021 at 10:44 PM Ajin Cherian <itsajin@gmail.com> wrote:
Here's a rebase of the first 4 patches of the row-filter patch. Some
issues still remain:
1. The following changes for adding OptWhereClause to
PublicationObjSpec have not been added, as the test cases for this
have not yet been rebased:
PublicationObjSpec:
  ... + TABLE relation_expr OptWhereClause
  ... + | ColId OptWhereClause
  ... + | ColId indirection OptWhereClause
  ... + | extended_relation_expr OptWhereClause
This is addressed in the v36-0001 patch [1].
------
[1]: /messages/by-id/CAFPTHDYKfxTr2zpA-fC12u+hL2abCc=276OpJQUTyc6FBgYX9g@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
Hi.
During some ad-hoc filter testing I observed a quirk when there are
duplicate tables. I think we need to define/implement some proper
rules for this behaviour.
=====
BACKGROUND
When the same table appears multiple times in a CREATE PUBLICATION
then those duplicates are simply ignored. The end result is that the
table is only one time in the publication.
This is fine and makes no difference where there are no row-filters
(because the duplicates are all exactly the same as each other), but
if there *are* row-filters there there is a quirky behaviour.
=====
PROBLEM
Apparently it is the *first* of the occurrences that is used and all
the other duplicates are ignored.
In practice it looks like this.
ex.1)
DROP PUBLICATION
test_pub=# CREATE PUBLICATION p1 FOR TABLE t1 WHERE (a=1), t1 WHERE (a=2);
CREATE PUBLICATION
test_pub=# \dRp+ p1
Publication p1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
----------+------------+---------+---------+---------+-----------+----------
postgres | f | t | t | t | t | f
Tables:
"public.t1" WHERE ((a = 1))
** Notice that the 2nd filter (a=2) was ignored
~
IMO ex1 is wrong behaviour. I think that any subsequent duplicate
table names should behave the same as if the CREATE was a combination
of CREATE PUBLICATION then ALTER PUBLICATION SET.
Like this:
ex.2)
test_pub=# CREATE PUBLICATION p1 FOR TABLE t1 WHERE (a=1);
CREATE PUBLICATION
test_pub=# ALTER PUBLICATION p1 SET TABLE t1 WHERE (a=2);
ALTER PUBLICATION
test_pub=# \dRp+ p1
Publication p1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
----------+------------+---------+---------+---------+-----------+----------
postgres | f | t | t | t | t | f
Tables:
"public.t1" WHERE ((a = 2))
** Notice that the 2nd filter (a=2) overwrites the 1st filter (a=1) as expected.
~~
The current behaviour of duplicates becomes even more "unexpected" if
duplicate tables occur in a single ALTER PUBLICATION ... SET command.
ex.3)
test_pub=# CREATE PUBLICATION p1;
CREATE PUBLICATION
test_pub=# ALTER PUBLICATION p1 SET TABLE t1 WHERE (a=1), t1 WHERE (a=2);
ALTER PUBLICATION
test_pub=# \dRp+ p1
Publication p1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
----------+------------+---------+---------+---------+-----------+----------
postgres | f | t | t | t | t | f
Tables:
"public.t1" WHERE ((a = 1))
** Notice the 2nd filter (a=2) did not overwrite the 1st filter (a=1).
I think a user would be quite surprised by this behaviour.
=====
PROPOSAL
I propose that we change the way duplicate tables are processed to
make it so that it is always the *last* one that takes effect (instead
of the *first* one). AFAIK doing this won't affect any current PG
behaviour, but doing this will let the new row-filter feature work in
a consistent/predictable/sane way.
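The proposed last-one-wins rule amounts to a deduplication pass over the (table, filter) list where a later occurrence of a table name overwrites the earlier entry. A minimal standalone sketch, with hypothetical structures rather than the actual parser ones:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical (table, filter) pair; not the real publication structures. */
typedef struct
{
    const char *table;
    const char *filter;
} PubTable;

/*
 * Deduplicate so the *last* occurrence of a table name wins.
 * Entries are compacted in place; returns the resulting count.
 */
static int
dedup_last_wins(PubTable *list, int n)
{
    int out = 0;

    for (int i = 0; i < n; i++)
    {
        int j;

        /* Look for an earlier entry with the same table name. */
        for (j = 0; j < out; j++)
            if (strcmp(list[j].table, list[i].table) == 0)
                break;

        list[j] = list[i];      /* overwrite the duplicate, or append */
        if (j == out)
            out++;
    }
    return out;
}
```

With this rule, `FOR TABLE t1 WHERE (a=1), t1 WHERE (a=2)` would keep only the `(a=2)` filter, matching the CREATE-then-ALTER SET behaviour shown in ex.2.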
Thoughts?
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Thu, Nov 4, 2021 at 8:17 AM Peter Smith <smithpb2250@gmail.com> wrote:
PROPOSAL
I propose that we change the way duplicate tables are processed to
make it so that it is always the *last* one that takes effect (instead
of the *first* one).
I don't have a good reason to prefer one over another but I think if
we do this then we should document the chosen behavior. BTW, why not
give an error if the duplicate table is present and any one of them or
both have row-filters? I think the current behavior makes sense
because it makes no difference if the table is present more than once
in the list but with row-filter it can make difference so it seems to
me that giving an error should be considered.
--
With Regards,
Amit Kapila.
On Wednesday, November 3, 2021 8:51 PM Ajin Cherian <itsajin@gmail.com> wrote:
On Tue, Nov 2, 2021 at 10:44 PM Ajin Cherian <itsajin@gmail.com> wrote:
The patches 0005 and 0006 have not yet been rebased but will be updated
in a few days. Here's a rebase of all the 6 patches. Issues remaining:
Thanks for the patches.
I started to review the patches and here are a few comments.
1)
/*
* ALTER PUBLICATION ... ADD TABLE provides a PublicationTable List
* (Relation, Where clause). ALTER PUBLICATION ... DROP TABLE provides
* a Relation List. Check the List element to be used.
*/
if (IsA(lfirst(lc), PublicationTable))
whereclause = true;
else
whereclause = false;
I am not sure about the comments here; wouldn't it be better to always
provide a PublicationTable List, which would be more consistent?
2)
+ if ($3)
+ {
+ $$->pubtable->whereClause = $3;
+ }
It seems we can remove the if ($3) check here.
3)
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstates = lappend(entry->exprstates, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
Currently the patch saves and executes each expression separately. I was
thinking it might be better if we can use "AND" to combine all the expressions
into one expression, then we can initialize and optimize the final expression
and execute it only once.
Best regards,
Hou zj
On Thu, Nov 4, 2021 at 2:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Nov 4, 2021 at 8:17 AM Peter Smith <smithpb2250@gmail.com> wrote:
PROPOSAL
I propose that we change the way duplicate tables are processed to
make it so that it is always the *last* one that takes effect (instead
of the *first* one).
I don't have a good reason to prefer one over another but I think if
we do this then we should document the chosen behavior. BTW, why not
give an error if the duplicate table is present and any one of them or
both have row-filters? I think the current behavior makes sense
because it makes no difference if the table is present more than once
in the list but with row-filter it can make difference so it seems to
me that giving an error should be considered.
Yes, giving an error if any duplicate table has a filter is also a
good alternative solution.
I only wanted to demonstrate the current problem, and get some
consensus on the solution before implementing a fix. If others are
happy to give an error for this case then that is fine by me too.
------
Kind Regards,
Peter Smith.
Fujitsu Australia.
PSA new set of v37* patches.
This addresses some pending review comments as follows:
v34-0001 = the "main" patch.
- fixed Houz review comment #1 [1]
- fixed Houz review comment #2 [1]
- fixed Tomas review comment #5 [2]
v34-0002 = tab auto-complete.
- not changed
v34-0003 = cache updates.
- not changed
v34-0004 = filter validation replica identity.
- not changed
v34-0005 = filter validation walker.
- not changed
v34-0006 = support old/new tuple logic for row-filters.
- Ajin fixed Tomas review comment #14 [2]
- Ajin fixed Greg review comment #1 [3]
- Ajin fixed Greg review comment #2 [3]
- Ajin fixed Greg review comment #3 [3]
- Ajin fixed Greg review comment #1 [4]
------
[1]: /messages/by-id/OS0PR01MB5716090A70A73ADF58C58950948D9@OS0PR01MB5716.jpnprd01.prod.outlook.com
[2]: /messages/by-id/574b4e78-2f35-acf3-4bdc-4b872582e739@enterprisedb.com
[3]: /messages/by-id/CAJcOf-dNDy=rzUD=2H54J-VVUJCxq94o_2Sqc35RovFLKkSj7Q@mail.gmail.com
[4]: /messages/by-id/CAJcOf-dViJh-F4oJkMQchAD19LELuCNbCqKfia5S7jsOASO6yA@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v37-0004-PS-Row-filter-validation-of-replica-identity.patch (application/octet-stream)
From ef7d0f25e0f7edaea5fd4cf21ed153e6fe7af02f Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 5 Nov 2021 13:41:50 +1100
Subject: [PATCH v37] PS - Row filter validation of replica identity.
This patch introduces some additional row filter validation. Currently it is
implemented only for the publish mode "delete" and it validates that any columns
referenced in the filter expression must be part of REPLICA IDENTITY or Primary
Key.
Also updated test code.
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
---
src/backend/catalog/dependency.c | 55 ++++++++++++++
src/backend/catalog/pg_publication.c | 77 ++++++++++++++++++-
src/include/catalog/dependency.h | 2 +
src/test/regress/expected/publication.out | 119 ++++++++++++++++++++++++------
src/test/regress/sql/publication.sql | 79 +++++++++++++++++++-
src/test/subscription/t/025_row_filter.pl | 7 +-
6 files changed, 311 insertions(+), 28 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index 9f8eb1a..de7ff90 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -1560,6 +1560,61 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Find all the columns referenced by the row-filter expression and return them
+ * as a list of attribute numbers. This list is used for row-filter validation.
+ */
+List *
+rowfilter_find_cols(Node *expr, Oid relId)
+{
+ find_expr_references_context context;
+ RangeTblEntry rte;
+ int ref;
+ List *rfcols = NIL;
+
+ context.addrs = new_object_addresses();
+
+ /* We gin up a rather bogus rangetable list to handle Vars */
+ MemSet(&rte, 0, sizeof(rte));
+ rte.type = T_RangeTblEntry;
+ rte.rtekind = RTE_RELATION;
+ rte.relid = relId;
+ rte.relkind = RELKIND_RELATION; /* no need for exactness here */
+ rte.rellockmode = AccessShareLock;
+
+ context.rtables = list_make1(list_make1(&rte));
+
+ /* Scan the expression tree for referenceable objects */
+ find_expr_references_walker(expr, &context);
+
+ /* Remove any duplicates */
+ eliminate_duplicate_dependencies(context.addrs);
+
+ /* Build/Return the list of columns referenced by this Row Filter */
+ for (ref = 0; ref < context.addrs->numrefs; ref++)
+ {
+ ObjectAddress *thisobj = context.addrs->refs + ref;
+
+ if (thisobj->classId == RelationRelationId)
+ {
+ AttrNumber attnum;
+
+ /*
+ * The parser already took care of ensuring columns must be from
+ * the correct table.
+ */
+ Assert(thisobj->objectId == relId);
+
+ attnum = thisobj->objectSubId;
+ rfcols = lappend_int(rfcols, attnum);
+ }
+ }
+
+ free_object_addresses(context.addrs);
+
+ return rfcols;
+}
+
+/*
* recordDependencyOnExpr - find expression dependencies
*
* This is used to find the dependencies of rules, constraint expressions,
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 57d08e7..e04e069 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -214,9 +214,79 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
}
/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
+ * Decide if the row-filter is valid according to the following rules:
+ *
+ * Rule 1. If the publish operation contains "delete" then only columns that
+ * are allowed by the REPLICA IDENTITY rules are permitted to be used in the
+ * row-filter WHERE clause.
+ *
+ * Rule 2. TODO
*/
+static void
+rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+{
+ Oid relid = RelationGetRelid(rel);
+ char *relname = RelationGetRelationName(rel);
+
+ /*
+ * Rule 1: For "delete", check that filter cols are also valid replica
+ * identity cols.
+ *
+ * TODO - check later for publish "update" case.
+ */
+ if (pub->pubactions.pubdelete)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ List *rfcols;
+ ListCell *lc;
+ Bitmapset *bms_okcols;
+
+ /*
+ * Find what are the cols that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+
+ /*
+ * Find what cols are referenced in the row filter WHERE clause,
+ * and validate that each of those referenced cols is allowed.
+ */
+ rfcols = rowfilter_find_cols(rfnode, relid);
+ foreach(lc, rfcols)
+ {
+ int attnum = lfirst_int(lc);
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, bms_okcols))
+ {
+ const char *colname = get_attname(relid, attnum, false);
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ relname),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+ }
+
+ bms_free(bms_okcols);
+ list_free(rfcols);
+ }
+ }
+}
+
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -315,6 +385,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 3eca295..2427321 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -152,6 +152,8 @@ extern void performDeletion(const ObjectAddress *object,
extern void performMultipleDeletions(const ObjectAddresses *objects,
DropBehavior behavior, int flags);
+extern List *rowfilter_find_cols(Node *expr, Oid relId);
+
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
DependencyType behavior);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 8b1bcd1..72f8dff 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -246,13 +246,15 @@ CREATE TABLE testpub_rf_tbl4 (g text);
CREATE SCHEMA testpub_rf_myschema;
CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -262,7 +264,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -273,7 +275,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -284,37 +286,27 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+ERROR: cannot add relation "testpub_rf_tbl3" to publication
+DETAIL: Row filter column "e" is not part of the REPLICA IDENTITY
RESET client_min_messages;
\dRp+ testpub_syntax1
- Publication testpub_syntax1
- Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
---------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
-Tables:
- "public.testpub_rf_tbl1"
- "public.testpub_rf_tbl3" WHERE ((e < 999))
-
DROP PUBLICATION testpub_syntax1;
+ERROR: publication "testpub_syntax1" does not exist
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+ERROR: cannot add relation "testpub_rf_tbl5" to publication
+DETAIL: Row filter column "h" is not part of the REPLICA IDENTITY
RESET client_min_messages;
\dRp+ testpub_syntax2
- Publication testpub_syntax2
- Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
---------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
-Tables:
- "public.testpub_rf_tbl1"
- "testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
-
DROP PUBLICATION testpub_syntax2;
+ERROR: publication "testpub_syntax2" does not exist
-- fail - functions disallowed
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
ERROR: functions are not allowed in publication WHERE expressions
@@ -341,6 +333,91 @@ DROP SCHEMA testpub_rf_myschema;
DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index acf43c8..2f291eb 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -141,7 +141,9 @@ CREATE TABLE testpub_rf_tbl4 (g text);
CREATE SCHEMA testpub_rf_myschema;
CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -181,6 +183,81 @@ DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index e806b5d..dff55c2 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -223,9 +225,7 @@ $node_publisher->wait_for_catchup($appname);
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -234,7 +234,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
--
1.8.3.1
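The `025_row_filter.pl` change above flips the expected result for the DELETE of row 1700: once `tab_rowfilter_1` has REPLICA IDENTITY FULL, the old tuple carries every column, so the filter `a > 1000 AND b <> 'filtered'` can evaluate to true instead of NULL. As a rough illustration (this is an editorial Python sketch of the publish decision, not code from the patch), SQL's three-valued logic means a filter that sees a NULL column yields NULL, which counts as "do not publish":

```python
# Hypothetical model of the publisher's row-filter decision for a DELETE.
# Under REPLICA IDENTITY DEFAULT, the old tuple carries only key columns;
# under REPLICA IDENTITY FULL it carries every column.

def old_tuple(row, identity_cols):
    """Columns outside the replica identity appear as NULL (None) in the old tuple."""
    return {k: (v if k in identity_cols else None) for k, v in row.items()}

def row_filter(t):
    # Mirrors the test's filter: a > 1000 AND b <> 'filtered'.
    # SQL three-valued logic: comparing with NULL yields NULL, and a
    # NULL filter result means the row is not published.
    if t["a"] is None or t["b"] is None:
        return False
    return t["a"] > 1000 and t["b"] != "filtered"

row = {"a": 1700, "b": "test 1700"}

# REPLICA IDENTITY DEFAULT (PK is "a"): b is NULL, filter yields NULL -> skipped
publish_default = row_filter(old_tuple(row, {"a"}))

# REPLICA IDENTITY FULL: all columns present, filter is true -> replicated
publish_full = row_filter(old_tuple(row, {"a", "b"}))

print(publish_default, publish_full)  # False True
```

This is why the old comment ("DELETE (1700) NO ... old tuple contains b = NULL") is replaced by "DELETE (1700) YES" once the test sets REPLICA IDENTITY FULL.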
v37-0005-PS-Row-filter-validation-walker.patch (application/octet-stream)
From 775344e68900385b34afe52018f5d32973844429 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 5 Nov 2021 14:19:28 +1100
Subject: [PATCH v37] PS - Row filter validation walker.
This patch implements a parse-tree "walker" to validate a row-filter expression.
Only very simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no user-defined functions.
- no system functions (unless they are IMMUTABLE). See design decision at [1].
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr
[1] https://www.postgresql.org/message-id/CAA4eK1%2BXoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ%40mail.gmail.com
This patch also disables (#if 0) all other row-filter errors which were thrown for
EXPR_KIND_PUBLICATION_WHERE in the 0001 patch.
Some regression tests are updated due to modified validation messages and rules.
---
src/backend/catalog/dependency.c | 93 +++++++++++++++++++++++++++++++
src/backend/catalog/pg_publication.c | 20 ++++---
src/backend/parser/parse_agg.c | 5 +-
src/backend/parser/parse_expr.c | 6 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 2 +
src/include/catalog/dependency.h | 2 +-
src/test/regress/expected/publication.out | 25 ++++++---
src/test/regress/sql/publication.sql | 11 +++-
9 files changed, 149 insertions(+), 18 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index de7ff90..fbf8d78 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -133,6 +133,12 @@ typedef struct
int subflags; /* flags to pass down when recursing to obj */
} ObjectAddressAndFlags;
+/* for rowfilter_walker */
+typedef struct
+{
+ char *relname;
+} rf_context;
+
/* for find_expr_references_walker */
typedef struct
{
@@ -1560,6 +1566,93 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Walker checks that the row filter expression is legal. Allow only simple
+ * or compound expressions like:
+ *
+ * "(Var Op Const)" or
+ * "(Var Op Const) Bool (Var Op Const)"
+ *
+ * User-defined operators are not allowed.
+ * User-defined functions are not allowed.
+ * System functions that are not IMMUTABLE are not allowed.
+ */
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+ bool too_complex = false;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var) || IsA(node, Const) || IsA(node, BoolExpr))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ Oid funcid = ((FuncExpr *)node)->funcid;
+ char *funcname = get_func_name(funcid);
+
+ /*
+ * User-defined functions are not allowed.
+ * System functions that are not IMMUTABLE are not allowed.
+ */
+ if (funcid >= FirstNormalObjectId)
+ {
+ forbidden = psprintf("user-defined functions are not allowed: %s",
+ funcname);
+ }
+ else
+ {
+ if (func_volatile(funcid) != PROVOLATILE_IMMUTABLE)
+ forbidden = psprintf("system functions that are not IMMUTABLE are not allowed: %s",
+ funcname);
+ }
+ }
+ else
+ {
+ elog(DEBUG1, "the row filter contained something unexpected: %s", nodeToString(node));
+ too_complex = true;
+ }
+
+ if (too_complex)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ context->relname),
+ errhint("only simple expressions using columns, constants and immutable system functions are allowed")
+ ));
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ context->relname),
+ errdetail("%s", forbidden)
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
+/*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if it encounters anything not permitted or unexpected.
+ */
+void
+rowfilter_validator(char *relname, Node *expr)
+{
+ rf_context context = {0};
+
+ context.relname = relname;
+ rowfilter_walker(expr, &context);
+}
+
+/*
* Find all the columns referenced by the row-filter expression and return them
* as a list of attribute numbers. This list is used for row-filter validation.
*/
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index e04e069..f56e01f 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -216,20 +216,26 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
/*
* Decide if the row-filter is valid according to the following rules:
*
- * Rule 1. If the publish operation contains "delete" then only columns that
+ * Rule 1. Walk the parse-tree and reject anything other than very simple
+ * expressions. (See rowfilter_validator for details of what is permitted).
+ *
+ * Rule 2. If the publish operation contains "delete" then only columns that
* are allowed by the REPLICA IDENTITY rules are permitted to be used in the
* row-filter WHERE clause.
- *
- * Rule 2. TODO
*/
static void
-rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node *rfnode, Relation rel)
{
Oid relid = RelationGetRelid(rel);
char *relname = RelationGetRelationName(rel);
/*
- * Rule 1: For "delete", check that filter cols are also valid replica
+ * Rule 1. Walk the parse-tree and reject anything unexpected.
+ */
+ rowfilter_validator(relname, rfnode);
+
+ /*
+ * Rule 2: For "delete", check that filter cols are also valid replica
* identity cols.
*
* TODO - check later for publish "update" case.
@@ -381,13 +387,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
whereclause = transformWhereClause(pstate,
copyObject(pri->whereClause),
EXPR_KIND_PUBLICATION_WHERE,
- "PUBLICATION");
+ "PUBLICATION WHERE");
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
/* Validate the row-filter. */
- rowfilter_expr_checker(pub, whereclause, targetrel);
+ rowfilter_expr_checker(pub, pstate, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..5e0c391 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,12 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
if (isAgg)
err = _("aggregate functions are not allowed in publication WHERE expressions");
else
err = _("grouping operations are not allowed in publication WHERE expressions");
-
+#endif
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +952,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("window functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..3519e62 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -201,6 +201,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
case T_FuncCall:
{
+#if 0
/*
* Forbid functions in publication WHERE condition
*/
@@ -209,6 +210,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("functions are not allowed in publication WHERE expressions"),
parser_errposition(pstate, exprLocation(expr))));
+#endif
result = transformFuncCall(pstate, (FuncCall *) expr);
break;
@@ -1777,7 +1779,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("cannot use subquery in publication WHERE expression");
+#endif
break;
/*
@@ -3100,7 +3104,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 29bebb7..212f473 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,10 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+ pstate->p_hasTargetSRFs = true;
+#if 0
err = _("set-returning functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..b3588df 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,12 +718,14 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+#if 0
/* Check it's not a custom operator for publication WHERE expressions */
if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("user-defined operators are not allowed in publication WHERE expressions"),
parser_errposition(pstate, location)));
+#endif
/* Do typecasting and build the expression tree */
if (ltree == NULL)
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 2427321..e118256 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -16,7 +16,6 @@
#include "catalog/objectaddress.h"
-
/*
* Precise semantics of a dependency relationship are specified by the
* DependencyType code (which is stored in a "char" field in pg_depend,
@@ -153,6 +152,7 @@ extern void performMultipleDeletions(const ObjectAddresses *objects,
DropBehavior behavior, int flags);
extern List *rowfilter_find_cols(Node *expr, Oid relId);
+extern void rowfilter_validator(char *relname, Node *expr);
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 72f8dff..79b1955 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -308,17 +308,27 @@ RESET client_min_messages;
DROP PUBLICATION testpub_syntax2;
ERROR: publication "testpub_syntax2" does not exist
-- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: user-defined functions are not allowed: testpub_rf_func99
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: system functions that are not IMMUTABLE are not allowed: random
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+HINT: only simple expressions using columns, constants and immutable system functions are allowed
-- fail - WHERE not allowed in DROP
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
@@ -333,6 +343,7 @@ DROP SCHEMA testpub_rf_myschema;
DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
-- ======================================================
-- More row filter tests for validating column references
CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 2f291eb..6faa1de 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -165,11 +165,19 @@ RESET client_min_messages;
\dRp+ testpub_syntax2
DROP PUBLICATION testpub_syntax2;
-- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
-- fail - WHERE not allowed in DROP
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
@@ -182,6 +190,7 @@ DROP SCHEMA testpub_rf_myschema;
DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
-- ======================================================
-- More row filter tests for validating column references
--
1.8.3.1
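The heart of v37-0005 is `rowfilter_walker`: a recursive allow-list over the parse tree that accepts only Var, Const, BoolExpr, built-in OpExpr, and immutable built-in FuncExpr nodes, and errors out on anything else. The shape of that walker can be sketched in Python (an editorial stand-in for the C code in `dependency.c`; the node classes, the OID values, and the `immutable` flag here are illustrative assumptions, with `FirstNormalObjectId` playing its real role as the boundary below which objects are built-in):

```python
# Illustrative allow-list walker for row-filter validation.
FIRST_NORMAL_OBJECT_ID = 16384  # OIDs at or above this are user-defined

class Node:
    def children(self):
        return []

class Var(Node): pass
class Const(Node): pass

class BoolExpr(Node):            # AND / OR / NOT
    def __init__(self, *args): self.args = list(args)
    def children(self): return self.args

class OpExpr(Node):              # e.g. "a > 99"
    def __init__(self, opno, left, right):
        self.opno, self.left, self.right = opno, left, right
    def children(self): return [self.left, self.right]

class FuncExpr(Node):            # e.g. "int8inc(999)"
    def __init__(self, funcid, immutable, args):
        self.funcid, self.immutable, self.args = funcid, immutable, args
    def children(self): return self.args

class SubLink(Node): pass        # e.g. "a IN (SELECT ...)" -- not simple

def rowfilter_walker(node):
    """Raise ValueError on any node a row filter may not contain."""
    if isinstance(node, (Var, Const, BoolExpr)):
        pass                                        # always allowed
    elif isinstance(node, OpExpr):
        if node.opno >= FIRST_NORMAL_OBJECT_ID:
            raise ValueError("user-defined operators are not allowed")
    elif isinstance(node, FuncExpr):
        if node.funcid >= FIRST_NORMAL_OBJECT_ID:
            raise ValueError("user-defined functions are not allowed")
        if not node.immutable:
            raise ValueError("non-IMMUTABLE system functions are not allowed")
    else:
        raise ValueError("only simple expressions are allowed")
    for child in node.children():                   # recurse, like
        rowfilter_walker(child)                     # expression_tree_walker

# "(a > 99) AND (b < 5)" with built-in operators: accepted without error
rowfilter_walker(BoolExpr(OpExpr(521, Var(), Const()),
                          OpExpr(97, Var(), Const())))
```

Doing the check as a walker over the transformed expression, rather than scattering per-node errors through the parser (the `#if 0` hunks above), keeps all the validation rules in one place.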
v37-0001-Row-filter-for-logical-replication.patch (application/octet-stream)
From 9669c07cc4119f628ccda22af8e80d366a50005a Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 5 Nov 2021 12:29:59 +1100
Subject: [PATCH v37] Row filter for logical replication.
This feature adds row filtering for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; this could be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If the subscriber is a pre-15 version, data
synchronization won't use row filters even if they are defined on the
publisher, because previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 13 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 50 ++++-
src/backend/commands/publicationcmds.c | 32 ++-
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 44 ++--
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 257 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_publication.h | 4 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 102 ++++++++++
src/test/regress/sql/publication.sql | 47 +++++
src/test/subscription/t/025_row_filter.pl | 300 ++++++++++++++++++++++++++++
25 files changed, 1040 insertions(+), 46 deletions(-)
mode change 100644 => 100755 src/backend/parser/gram.y
create mode 100644 src/test/subscription/t/025_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1d11be..af6b1f6 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6299,6 +6299,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index bb4ef5e..01247d7 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -30,7 +30,7 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -52,7 +52,9 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
remove one or more tables/schemas from the publication. Note that adding
tables/schemas to a publication that is already subscribed to will require a
<literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on the
- subscribing side in order to become effective.
+ subscribing side in order to become effective. Note also that a row-filter
+ <literal>WHERE</literal> clause cannot be specified in a
+ <literal>DROP</literal> clause.
</para>
<para>
@@ -109,7 +111,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ca01d8c..52f6a1c 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -28,7 +28,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -76,6 +76,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. The <literal>WHERE</literal> clause has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -226,6 +230,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should reference only columns that are
+ part of the primary key or are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will not
+ be replicated. That is because the old row is used, and it contains only the
+ primary key or the columns that are part of the <literal>REPLICA
+ IDENTITY</literal>; the remaining columns are <literal>NULL</literal>. For
+ <command>INSERT</command> and <command>UPDATE</command> operations, any
+ column can be used in the <literal>WHERE</literal> clause, because the new
+ row is used and it contains all columns. A <literal>NULL</literal> value
+ causes the expression to evaluate to false; avoid using columns without
+ not-null constraints in the <literal>WHERE</literal> clause. The
+ <literal>WHERE</literal> clause does not allow functions or user-defined operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -240,6 +259,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -253,6 +277,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
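The NULL-as-false rule documented above can be sketched in Python (hypothetical helper names; the server evaluates the parsed expression in C via the executor):

```python
# Sketch of the row-filter semantics: a filter that evaluates to NULL
# (None here) is treated as false, so the row is not published.
def row_matches(row, predicate):
    """Three-valued predicate: True, False, or None (SQL NULL)."""
    result = predicate(row)
    return bool(result) if result is not None else False

# A plain boolean column reference can yield NULL for an unset column.
dept_filter = lambda row: row["active"]

assert row_matches({"active": True}, dept_filter) is True
assert row_matches({"active": False}, dept_filter) is False
# A NULL column value makes the filter NULL, which suppresses the row.
assert row_matches({"active": None}, dept_filter) is False
```

This is why the documentation recommends avoiding columns without not-null constraints in the filter.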
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..5a9430e 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -206,7 +206,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether to copy pre-existing data in the publications
that are being subscribed to when the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the publisher is a
+ <productname>PostgreSQL</productname> version before 15, any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
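The "rows must satisfy all expressions" behavior for a table published with different WHERE clauses in several publications can be sketched like this (a Python sketch, not the C implementation):

```python
def should_copy(row, filters):
    """Row is copied only if it satisfies every publication's WHERE
    clause; a NULL result (None) counts as false."""
    for predicate in filters:
        result = predicate(row)
        if result is None or not result:
            return False
    return True

# Two publications filter the same table; the copy takes the AND.
filters = [lambda r: r["id"] > 2000, lambda r: r["id"] <= 3000]
assert should_copy({"id": 2500}, filters)
assert not should_copy({"id": 1500}, filters)
assert not should_copy({"id": 3500}, filters)
assert should_copy({"id": 42}, [])  # no filters: everything is copied
```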
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index fed83b8..57d08e7 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -251,22 +254,28 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -283,10 +292,30 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -300,6 +329,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -316,6 +351,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
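The catalog-insert flow in publication_add_relation (duplicate check first, then store the qual in serialized form, the server uses nodeToString on the transformed tree) can be sketched with a dict standing in for pg_publication_rel; all names here are hypothetical:

```python
class DuplicateObject(Exception):
    pass

def publication_add_relation(catalog, pubname, relname, qual=None):
    """Insert a (publication, relation) mapping; the optional qual is
    stored serialized (prqual); None maps to a NULL prqual."""
    if (pubname, relname) in catalog:
        raise DuplicateObject(
            f'relation "{relname}" is already member of publication "{pubname}"')
    catalog[(pubname, relname)] = qual

catalog = {}
publication_add_relation(catalog, "foo", "departments", "(id > 2000)")
assert catalog[("foo", "departments")] == "(id > 2000)"
try:
    publication_add_relation(catalog, "foo", "departments")
    assert False, "duplicate should be rejected"
except DuplicateObject:
    pass
```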
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index d1fff13..4a69a4c 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -529,6 +529,28 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *delrels = NIL;
ListCell *oldlc;
+#if 1
+ /* FIXME: can we do a better job of integrating this with the schema changes? */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
+ foreach(oldlc, oldrelids)
+ {
+ Oid oldrelid = lfirst_oid(oldlc);
+ PublicationRelInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
+ }
+#else
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
@@ -565,6 +587,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
delrels = lappend(delrels, pubrel);
}
}
+#endif
/* And drop them. */
PublicationDropTables(pubid, delrels, true);
@@ -931,7 +954,9 @@ OpenTableList(List *tables)
}
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = myrelid;
pub_rel->relation = rel;
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
@@ -966,7 +991,10 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = childrelid;
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -993,6 +1021,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -1042,7 +1072,7 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(pub_rel->relid, GetUserId()))
aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
RelationGetRelationName(rel));
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 82464c9..0f4c64d 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4829,6 +4829,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index f537d3e..5776447 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2312,6 +2312,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
old mode 100644
new mode 100755
index d0eb80e..9e6eb52
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -205,7 +205,7 @@ static void processCASbits(int cas_bits, int location, const char *constrType,
bool *deferrable, bool *initdeferred, bool *not_valid,
bool *no_inherit, core_yyscan_t yyscanner);
static void preprocess_pubobj_list(List *pubobjspec_list,
- core_yyscan_t yyscanner);
+ core_yyscan_t yyscanner, bool alter_drop);
static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%}
@@ -9633,7 +9633,7 @@ CreatePublicationStmt:
n->pubname = $3;
n->options = $6;
n->pubobjects = (List *)$5;
- preprocess_pubobj_list(n->pubobjects, yyscanner);
+ preprocess_pubobj_list(n->pubobjects, yyscanner, false);
$$ = (Node *)n;
}
;
@@ -9652,12 +9652,13 @@ CreatePublicationStmt:
* relation_expr here.
*/
PublicationObjSpec:
- TABLE relation_expr
+ TABLE relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_TABLE;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $2;
+ $$->pubtable->whereClause = $3;
}
| ALL TABLES IN_P SCHEMA ColId
{
@@ -9672,28 +9673,39 @@ PublicationObjSpec:
$$->pubobjtype = PUBLICATIONOBJ_CURRSCHEMA;
$$->location = @5;
}
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
$$->location = @1;
}
- | ColId indirection
+ | ColId indirection OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
+ $$->pubtable->whereClause = $3;
$$->location = @1;
}
/* grammar like tablename * , ONLY tablename, ONLY ( tablename ) */
- | extended_relation_expr
+ | extended_relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $1;
+ $$->pubtable->whereClause = $2;
}
| CURRENT_SCHEMA
{
@@ -9739,7 +9751,7 @@ AlterPublicationStmt:
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
n->pubobjects = $5;
- preprocess_pubobj_list(n->pubobjects, yyscanner);
+ preprocess_pubobj_list(n->pubobjects, yyscanner, false);
n->action = DEFELEM_ADD;
$$ = (Node *)n;
}
@@ -9748,7 +9760,7 @@ AlterPublicationStmt:
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
n->pubobjects = $5;
- preprocess_pubobj_list(n->pubobjects, yyscanner);
+ preprocess_pubobj_list(n->pubobjects, yyscanner, false);
n->action = DEFELEM_SET;
$$ = (Node *)n;
}
@@ -9757,7 +9769,7 @@ AlterPublicationStmt:
AlterPublicationStmt *n = makeNode(AlterPublicationStmt);
n->pubname = $3;
n->pubobjects = $5;
- preprocess_pubobj_list(n->pubobjects, yyscanner);
+ preprocess_pubobj_list(n->pubobjects, yyscanner, true);
n->action = DEFELEM_DROP;
$$ = (Node *)n;
}
@@ -17310,7 +17322,7 @@ processCASbits(int cas_bits, int location, const char *constrType,
* convert PUBLICATIONOBJ_CONTINUATION into appropriate PublicationObjSpecType.
*/
static void
-preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
+preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner, bool alter_drop)
{
ListCell *cell;
PublicationObjSpec *pubobj;
@@ -17341,7 +17353,15 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ /* cannot use a WHERE row filter when dropping a table from a publication */
+ if (pubobj->pubtable && pubobj->pubtable->whereClause && alter_drop)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE"),
+ parser_errposition(pubobj->location));
+
+ if (pubobj->name)
{
/* convert it to PublicationTable */
PublicationTable *pubtable = makeNode(PublicationTable);
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 542f916..29bebb7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
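The user-defined-operator check in make_op relies on the fact that every object created after initdb gets an OID at or above FirstNormalObjectId (16384). A Python sketch of that gate:

```python
FIRST_NORMAL_OBJECT_ID = 16384  # OIDs at or above this are user-defined

def check_operator_allowed(operator_oid):
    """Publication WHERE clauses may use only built-in operators; any
    operator created after initdb has oid >= FirstNormalObjectId."""
    if operator_oid >= FIRST_NORMAL_OBJECT_ID:
        raise ValueError(
            "user-defined operators are not allowed in publication "
            "WHERE expressions")

check_operator_allowed(96)   # a built-in operator OID: allowed
try:
    check_operator_allowed(16385)  # a user-defined operator: rejected
    assert False, "should have raised"
except ValueError:
    pass
```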
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..9d86a10 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get relation qual. DISTINCT avoids including the same expression
+ * multiple times when a table is published with the same qualification
+ * in more than one publication.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
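The query-building logic in copy_table above, plain COPY when there is no filter, otherwise a COPY (SELECT ... WHERE q1 AND q2 ...) TO STDOUT, can be sketched in Python (identifier quoting and the generated-columns case are omitted for brevity):

```python
def build_copy_command(nspname, relname, columns, quals):
    """Mimic copy_table(): no filter means plain COPY; with filters,
    build a SELECT with the AND'ed list of row filters."""
    target = f"{nspname}.{relname}"
    if not quals:
        return f"COPY {target} TO STDOUT"
    cmd = f"COPY (SELECT {', '.join(columns)} FROM {target}"
    cmd += " WHERE " + " AND ".join(quals)
    return cmd + ") TO STDOUT"

assert build_copy_command("public", "departments", ["id", "name"], []) == \
    "COPY public.departments TO STDOUT"
assert build_copy_command("public", "departments", ["id", "name"],
                          ["(id > 2000)", "(id <= 3000)"]) == \
    "COPY (SELECT id, name FROM public.departments" \
    " WHERE (id > 2000) AND (id <= 3000)) TO STDOUT"
```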
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6f6a203..077ae18 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +148,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +157,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +639,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache ExprState using CacheMemoryContext. This is the same code as
+ * ExecPrepareExpr() but that is not used because it doesn't use an EState.
+ * It should probably be another function in the executor to handle the
+ * execution outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false, i.e. the change
+ * isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
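The per-relation caching that get_rel_sync_entry provides for the compiled filters (the exprstate list added to RelationSyncEntry above) can be sketched like this; all names are stand-ins for the C structures:

```python
# Sketch: the row filter is compiled once per relation and reused for
# every subsequent change on that relation.
class RelationSyncEntry:
    def __init__(self):
        self.replicate_valid = False
        self.exprstate = []  # compiled filters, one per publication

_cache = {}

def get_rel_sync_entry(relid, compile_filters):
    entry = _cache.get(relid)
    if entry is None or not entry.replicate_valid:
        entry = RelationSyncEntry()
        entry.exprstate = compile_filters(relid)  # expensive, done once
        entry.replicate_valid = True
        _cache[relid] = entry
    return entry

calls = []
def compile_filters(relid):
    calls.append(relid)
    return [lambda row: row["id"] > 0]

e1 = get_rel_sync_entry(1234, compile_filters)
e2 = get_rel_sync_entry(1234, compile_filters)
assert e1 is e2 and calls == [1234]  # compiled only once
```

Invalidation (rel_sync_cache_relation_cb) would clear replicate_valid so the filters are recompiled after a publication change.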
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +808,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +832,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +839,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * The schema must be sent before the relation-switching logic
+ * below, because maybe_send_schema() also sends the ancestor's
+ * relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +872,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +906,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +975,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1116,9 +1297,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1141,6 +1323,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1160,6 +1344,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *schemaPubids = GetSchemaPublications(schemaId);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1173,6 +1358,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for the row filter. The TupleDesc must
+ * live as long as the cache entry remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1182,6 +1383,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1245,9 +1449,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1365,6 +1593,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1603,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1391,7 +1622,15 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ {
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
+ }
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index b9635a9..661fdf6 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4229,6 +4229,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4239,9 +4240,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4250,6 +4258,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4290,6 +4299,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationObject(&(pubrinfo[j].dobj), fout);
@@ -4360,8 +4373,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index f9af14b..00636ca 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -630,6 +630,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 0066614..0a7f278 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6320,8 +6320,12 @@ addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
for (i = 0; i < count; i++)
{
if (!singlecol)
+ {
printfPQExpBuffer(buf, " \"%s.%s\"", PQgetvalue(res, i, 0),
PQgetvalue(res, i, 1));
+ if (!PQgetisnull(res, i, 2))
+ appendPQExpBuffer(buf, " WHERE (%s)", PQgetvalue(res, i, 2));
+ }
else
printfPQExpBuffer(buf, " \"%s\"", PQgetvalue(res, i, 0));
@@ -6450,8 +6454,15 @@ describePublications(const char *pattern)
{
/* Get the tables for the specified publication */
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 1ae439e..964c204 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,7 +85,9 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
+ Oid relid;
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -122,7 +124,7 @@ extern List *GetPubPartitionOptionRelations(List *result,
Oid relid);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 49123e2..bb24aec 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3641,6 +3641,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 0f4fe4d..8b1bcd1 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -239,6 +239,108 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+ Publication testpub_syntax1
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE ((e < 999))
+
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+ Publication testpub_syntax2
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
+
+DROP PUBLICATION testpub_syntax2;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
+LINE 1: ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE ...
+ ^
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP SCHEMA testpub_rf_myschema;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 85a5302..acf43c8 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -134,6 +134,53 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP SCHEMA testpub_rf_myschema;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
new file mode 100644
index 0000000..e806b5d
--- /dev/null
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -0,0 +1,300 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions must succeed. In this case, rows are replicated only if c is
+# divisible by both 2 AND 3 (i.e., 6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false, so the row filter
+# of each partition is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, because NULL <> 'filtered' yields NULL, which the row filter treats as false
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After the TRUNCATE and REFRESH PUBLICATION, the initial data copy will
+# apply the partitioned table's row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
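Not part of the patch set — a minimal Python sketch of the SQL three-valued logic behind the test expectations above (function names are mine): a filter such as (a > 1000 AND b <> 'filtered') evaluates to NULL when b is NULL, and anything other than true suppresses replication.

```python
# None models SQL NULL; a row replicates only when the filter is True.

def sql_gt(x, y):
    return None if x is None or y is None else x > y

def sql_ne(x, y):
    return None if x is None or y is None else x != y

def sql_and(p, q):
    # SQL AND: false dominates, then NULL, then true
    if p is False or q is False:
        return False
    if p is None or q is None:
        return None
    return True

def replicates(a, b):
    # tap_pub_1 filter: (a > 1000 AND b <> 'filtered')
    return sql_and(sql_gt(a, 1000), sql_ne(b, 'filtered')) is True

print(replicates(1600, 'test 1600'))  # True: passes both conditions
print(replicates(800, 'test 800'))    # False: 800 is not > 1000
print(replicates(1600, None))         # False: NULL <> 'filtered' is NULL, not true
```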
v37-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patchapplication/octet-stream; name=v37-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patchDownload
From b3c75548486ded788b0c53d16adfe03037d3cce6 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 5 Nov 2021 12:37:58 +1100
Subject: [PATCH v37] PS - Add tab auto-complete support for the Row Filter
WHERE.
e.g.
"CREATE PUBLICATION <name> FOR TABLE <name>" - complete with "WHERE (".
"ALTER PUBLICATION <name> ADD|SET TABLE <name>" - complete with "WHERE (".
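The two completion rules above can be modeled with a toy matcher (hypothetical Python, not psql's real Matches/TailMatches engine): once six words of the statement are typed, suggest "WHERE (".

```python
# Toy model of the new tab-complete rules: inspect the words typed so
# far and return the candidate completions.

def complete(words):
    w = [x.upper() for x in words]
    # ALTER PUBLICATION <name> ADD|SET TABLE <name>
    if len(w) == 6 and w[:2] == ['ALTER', 'PUBLICATION'] and \
       w[3] in ('ADD', 'SET') and w[4] == 'TABLE':
        return ['WHERE (']
    # CREATE PUBLICATION <name> FOR TABLE <name>
    if len(w) == 6 and w[:2] == ['CREATE', 'PUBLICATION'] and \
       w[3] == 'FOR' and w[4] == 'TABLE':
        return ['WHERE (', 'WITH (']
    return []

print(complete(['CREATE', 'PUBLICATION', 'pub', 'FOR', 'TABLE', 't1']))
```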
---
src/bin/psql/tab-complete.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 8e01f54..c7c765b 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,14 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (");
+ /* "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with table attributes */
+ /* "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2708,10 +2716,13 @@ psql_completion(const char *text, int start, int end)
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
+ /* "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
/*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
--
1.8.3.1
v37-0003-PS-ExprState-cache-modifications.patchapplication/octet-stream; name=v37-0003-PS-ExprState-cache-modifications.patchDownload
From 0792143d9b4cfc120c5b43ade3814f1393be4918 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 5 Nov 2021 12:42:54 +1100
Subject: [PATCH v37] PS - ExprState cache modifications.
The cached row-filter state (e.g. the ExprState list) is now invalidated only
in the rel_sync_cache_relation_cb function, so ALTER PUBLICATION for one
table no longer causes the row filters of other tables to be invalidated too.
All code related to caching row filters has also been removed from the
get_rel_sync_entry function; the caching now happens just before the filters
are needed, in the pgoutput_row_filter function.
Changes are based on suggestions from Amit [1] [2].
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
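The commit message above can be sketched as a lazy, per-relation cache (hypothetical Python names, not the pgoutput code itself): invalidation only clears one relid, and the filter list is rebuilt on first use rather than eagerly in the entry lookup.

```python
# Lazy per-relation row-filter cache, invalidated one relid at a time.

class RelEntry:
    def __init__(self):
        self.rowfilter_valid = False   # like RelationSyncEntry.rowfilter_valid
        self.exprstates = []           # like RelationSyncEntry.exprstates

cache = {}                             # relid -> RelEntry

def invalidate_relation(relid):
    # analogue of rel_sync_cache_relation_cb: touch only this relation
    if relid in cache:
        cache[relid].rowfilter_valid = False
        cache[relid].exprstates = []

def row_filter(relid, build_filters):
    # analogue of the rebuild step in pgoutput_row_filter: one-time build
    entry = cache.setdefault(relid, RelEntry())
    if not entry.rowfilter_valid:
        entry.exprstates = build_filters(relid)
        entry.rowfilter_valid = True
    return entry.exprstates

row_filter(1, lambda r: ['filter_a'])
row_filter(2, lambda r: ['filter_b'])
invalidate_relation(1)                 # ALTER PUBLICATION touching relid 1
print(cache[1].rowfilter_valid, cache[2].rowfilter_valid)  # False True
```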
---
src/backend/replication/pgoutput/pgoutput.c | 200 +++++++++++++++++++---------
1 file changed, 136 insertions(+), 64 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 077ae18..3dfac7d 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -123,7 +123,15 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
- List *exprstate; /* ExprState for row filter */
+
+ /*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' only means that the exprstates list is
+ * correct. It doesn't mean that there actually is a row filter present
+ * for the current relid.
+ */
+ bool rowfilter_valid;
+ List *exprstates; /* ExprState for row filter(s) */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
@@ -161,7 +169,7 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static EState *create_estate_for_relation(Relation rel);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
/*
@@ -731,20 +739,121 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
* If it returns true, the change is replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
{
EState *estate;
ExprContext *ecxt;
ListCell *lc;
bool result = true;
+ Oid relid = RelationGetRelid(relation);
+
+ /*
+ * If the row filter caching is currently flagged "invalid" then it means we
+ * don't know yet whether there are any row filters for this relation.
+ *
+ * This code usually executes only once per relation.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ bool am_partition = get_rel_relispartition(relid);
+ MemoryContext oldctx;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains. Release the tuple table slot if it
+ * already exists.
+ */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState(s) and cache them in the
+ * entry->exprstates list.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+ Oid pub_relid = relid;
+
+ if (pub->pubviaroot && am_partition)
+ {
+ if (pub->alltables)
+ pub_relid = llast_oid(get_partition_ancestors(relid));
+ else
+ {
+ List *ancestors = get_partition_ancestors(relid);
+ ListCell *lc2;
+
+ /*
+ * Find the "topmost" ancestor that is in this
+ * publication.
+ */
+ foreach(lc2, ancestors)
+ {
+ Oid ancestor = lfirst_oid(lc2);
+
+ if (list_member_oid(GetRelationPublications(ancestor),
+ pub->oid))
+ {
+ pub_relid = ancestor;
+ }
+ }
+ }
+ }
+
+ /*
+ * Look up whether there is a row filter and, if so, build the ExprState for it.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(pub_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstates = lappend(entry->exprstates, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+
+ } /* loop all subscribed publications */
+
+ entry->rowfilter_valid = true;
+ }
/* Bail out if there is no row filter */
- if (entry->exprstate == NIL)
+ if (entry->exprstates == NIL)
return true;
elog(DEBUG3, "table \"%s.%s\" has row filter",
- get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
- get_rel_name(relation->rd_id));
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
PushActiveSnapshot(GetTransactionSnapshot());
@@ -761,7 +870,7 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
* different row filter in these publications, all row filters must be
* matched in order to replicate this change.
*/
- foreach(lc, entry->exprstate)
+ foreach(lc, entry->exprstates)
{
ExprState *exprstate = (ExprState *) lfirst(lc);
@@ -840,7 +949,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
break;
/*
@@ -873,7 +982,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -907,7 +1016,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1321,10 +1430,11 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
- entry->exprstate = NIL;
+ entry->exprstates = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1344,7 +1454,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
List *schemaPubids = GetSchemaPublications(schemaId);
ListCell *lc;
Oid publish_as_relid = relid;
- TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1358,22 +1467,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
publications_valid = true;
}
- /* Release tuple table slot */
- if (entry->scantuple != NULL)
- {
- ExecDropSingleTupleTableSlot(entry->scantuple);
- entry->scantuple = NULL;
- }
-
- /*
- * Create a tuple table slot for row filter. TupleDesc must live as
- * long as the cache remains.
- */
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- tupdesc = CreateTupleDescCopy(tupdesc);
- entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
- MemoryContextSwitchTo(oldctx);
-
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1383,9 +1476,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
{
Publication *pub = lfirst(lc);
bool publish = false;
- HeapTuple rftuple;
- Datum rfdatum;
- bool rfisnull;
if (pub->alltables)
{
@@ -1449,33 +1539,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- /*
- * Cache row filter, if available. All publication-table mappings
- * must be checked. If it is a partition and pubviaroot is true,
- * use the row filter of the topmost partitioned table instead of
- * the row filter of its own partition.
- */
- rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
- if (HeapTupleIsValid(rftuple))
- {
- rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
-
- if (!rfisnull)
- {
- Node *rfnode;
- ExprState *exprstate;
-
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = stringToNode(TextDatumGetCString(rfdatum));
-
- /* Prepare for expression execution */
- exprstate = pgoutput_row_filter_init_expr(rfnode);
- entry->exprstate = lappend(entry->exprstate, exprstate);
- MemoryContextSwitchTo(oldctx);
- }
-
- ReleaseSysCache(rftuple);
- }
}
list_free(pubids);
@@ -1582,6 +1645,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanups. (Will be rebuilt later if needed).
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstates != NIL)
+ {
+ list_free_deep(entry->exprstates);
+ entry->exprstates = NIL;
+ }
}
}
@@ -1622,12 +1700,6 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
-
- if (entry->exprstate != NIL)
- {
- list_free_deep(entry->exprstate);
- entry->exprstate = NIL;
- }
}
MemoryContextSwitchTo(oldctx);
--
1.8.3.1
v37-0006-Support-updates-based-on-old-and-new-tuple-in-ro.patchapplication/octet-stream; name=v37-0006-Support-updates-based-on-old-and-new-tuple-in-ro.patchDownload
From c32812fa4471ee577dd6b41fdcc0611415c45e15 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 5 Nov 2021 15:28:27 +1100
Subject: [PATCH v37] Support updates based on old and new tuple in row
filters.
When applying a row filter to updates, check both the old tuple and the
new tuple to decide how the update needs to be transformed.
UPDATE
old-row (match)    new-row (no match) -> DELETE
old-row (no match) new-row (match)    -> INSERT
old-row (match)    new-row (match)    -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Tuples that have been deformed are also cached in slots to avoid
deforming the same tuple multiple times.
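The transformation table above can be sketched as a small decision function (hypothetical Python helper): the filter result on the old and new tuple decides which action, if any, is replicated.

```python
# Map row-filter results on the old/new tuple to the replicated action.

def transform_update(old_matches, new_matches):
    """Return the action to replicate, or None to drop the change."""
    if old_matches and new_matches:
        return 'UPDATE'                # row stays in the published set
    if old_matches:
        return 'DELETE'                # row leaves the published set
    if new_matches:
        return 'INSERT'                # row enters the published set
    return None                        # row never visible to the subscriber

print(transform_update(True, False))   # DELETE
print(transform_update(False, True))   # INSERT
```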
---
src/backend/catalog/pg_publication.c | 6 +-
src/backend/replication/logical/proto.c | 122 +++++++++++++
src/backend/replication/pgoutput/pgoutput.c | 263 ++++++++++++++++++++++++++--
src/include/replication/logicalproto.h | 4 +
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/025_row_filter.pl | 4 +-
6 files changed, 378 insertions(+), 27 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index f56e01f..44bc4da 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -235,12 +235,10 @@ rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node *rfnode, Relat
rowfilter_validator(relname, rfnode);
/*
- * Rule 2: For "delete", check that filter cols are also valid replica
+ * Rule 2: For "delete" and "update", check that filter cols are also valid replica
* identity cols.
- *
- * TODO - check later for publish "update" case.
*/
- if (pub->pubactions.pubdelete)
+ if (pub->pubactions.pubdelete || pub->pubactions.pubupdate)
{
char replica_identity = rel->rd_rel->relreplident;
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..6b14340 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -32,6 +33,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
HeapTuple tuple, bool binary);
+static void logicalrep_write_tuple_cached(StringInfo out, Relation rel,
+ TupleTableSlot *slot, bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -438,6 +441,38 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
}
/*
+ * Write UPDATE to the output stream using cached virtual slots.
+ * Cached updates will have both old tuple and new tuple.
+ */
+void
+logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple, bool binary)
+{
+ pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
+
+ Assert(rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT ||
+ rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
+ rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX);
+
+ /* transaction ID (if not valid, we're not streaming) */
+ if (TransactionIdIsValid(xid))
+ pq_sendint32(out, xid);
+
+ /* use Oid as relation identifier */
+ pq_sendint32(out, RelationGetRelid(rel));
+
+ if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ pq_sendbyte(out, 'O'); /* old tuple follows */
+ else
+ pq_sendbyte(out, 'K'); /* old key follows */
+ logicalrep_write_tuple_cached(out, rel, oldtuple, binary);
+
+ pq_sendbyte(out, 'N'); /* new tuple follows */
+ logicalrep_write_tuple_cached(out, rel, newtuple, binary);
+}
+
+
+/*
* Write UPDATE to the output stream.
*/
void
@@ -746,6 +781,93 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
}
/*
+ * Write a tuple to the output stream using a cached slot, in the most efficient format possible.
+ */
+static void
+logicalrep_write_tuple_cached(StringInfo out, Relation rel, TupleTableSlot *slot, bool binary)
+{
+ TupleDesc desc;
+ int i;
+ uint16 nliveatts = 0;
+ HeapTuple tuple = ExecFetchSlotHeapTuple(slot, false, NULL);
+
+ desc = RelationGetDescr(rel);
+
+ for (i = 0; i < desc->natts; i++)
+ {
+ if (TupleDescAttr(desc, i)->attisdropped || TupleDescAttr(desc, i)->attgenerated)
+ continue;
+ nliveatts++;
+ }
+ pq_sendint16(out, nliveatts);
+
+ /* try to allocate enough memory from the get-go */
+ enlargeStringInfo(out, tuple->t_len +
+ nliveatts * (1 + 4));
+
+ /* Write the values */
+ for (i = 0; i < desc->natts; i++)
+ {
+ HeapTuple typtup;
+ Form_pg_type typclass;
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ if (att->attisdropped || att->attgenerated)
+ continue;
+
+ if (slot->tts_isnull[i])
+ {
+ pq_sendbyte(out, LOGICALREP_COLUMN_NULL);
+ continue;
+ }
+
+ if (att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(slot->tts_values[i]))
+ {
+ /*
+ * Unchanged toasted datum. (Note that we don't promise to detect
+ * unchanged data in general; this is just a cheap check to avoid
+ * sending large values unnecessarily.)
+ */
+ pq_sendbyte(out, LOGICALREP_COLUMN_UNCHANGED);
+ continue;
+ }
+
+ typtup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(att->atttypid));
+ if (!HeapTupleIsValid(typtup))
+ elog(ERROR, "cache lookup failed for type %u", att->atttypid);
+ typclass = (Form_pg_type) GETSTRUCT(typtup);
+
+ /*
+ * Send in binary if requested and type has suitable send function.
+ */
+ if (binary && OidIsValid(typclass->typsend))
+ {
+ bytea *outputbytes;
+ int len;
+
+ pq_sendbyte(out, LOGICALREP_COLUMN_BINARY);
+ outputbytes = OidSendFunctionCall(typclass->typsend, slot->tts_values[i]);
+ len = VARSIZE(outputbytes) - VARHDRSZ;
+ pq_sendint(out, len, 4); /* length */
+ pq_sendbytes(out, VARDATA(outputbytes), len); /* data */
+ pfree(outputbytes);
+ }
+ else
+ {
+ char *outputstr;
+
+ pq_sendbyte(out, LOGICALREP_COLUMN_TEXT);
+ outputstr = OidOutputFunctionCall(typclass->typoutput, slot->tts_values[i]);
+ pq_sendcountedtext(out, outputstr, strlen(outputstr), false);
+ pfree(outputstr);
+ }
+
+ ReleaseSysCache(typtup);
+ }
+}
+
+
+/*
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 3dfac7d..c970bc6 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -132,7 +132,10 @@ typedef struct RelationSyncEntry
*/
bool rowfilter_valid;
List *exprstates; /* ExprState for row filter(s) */
- TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
+ TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
+ TupleTableSlot *tmp_new_tuple; /* slot for temporary new tuple used for expression evaluation */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -167,10 +170,16 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot,
+ RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry,
+ ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -734,18 +743,103 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * Update is checked against the row filter, if any.
+ * Updates are transformed to inserts or deletes based on whether the
+ * old_tuple and new_tuple match the filter. The new action is returned
+ * in the action parameter; if it is left unchanged, the action remains
+ * an update.
+ * old-row (match)    new-row (no match) -> DELETE
+ * old-row (no match) new-row (match)    -> INSERT
+ * old-row (match)    new-row (match)    -> UPDATE
+ * old-row (no match) new-row (no match) -> (drop change)
+ * If it returns true, the change is to be replicated, otherwise it is not.
*/
static bool
-pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
{
- EState *estate;
- ExprContext *ecxt;
+ TupleDesc desc = entry->scantuple->tts_tupleDescriptor;
+ int i;
+ bool old_matched, new_matched;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstates == NIL)
+ return true;
+
+ /* updates require a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /*
+ * If there is no old_tuple, then none of the replica identity columns
+ * changed and this reduces to a simple update.
+ */
+ if (!oldtuple)
+ return pgoutput_row_filter(relation, NULL, newtuple, entry);
+
+
+ old_slot = entry->old_tuple;
+ new_slot = entry->new_tuple;
+ tmp_new_slot = entry->tmp_new_tuple;
+ ExecClearTuple(old_slot);
+ ExecClearTuple(new_slot);
+ ExecClearTuple(tmp_new_slot);
+
+ /*
+ * Unchanged toasted replica identity columns are only detoasted in the
+ * old tuple; copy these over to the new tuple.
+ */
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, new_slot);
+
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ }
+
+ }
+
+ old_matched = pgoutput_row_filter_virtual(relation, old_slot, entry);
+ new_matched = pgoutput_row_filter_virtual(relation, tmp_new_slot, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && !old_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter the first time it is needed.
+ */
+
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
+{
+ Oid relid = RelationGetRelid(relation);
ListCell *lc;
- bool result = true;
- Oid relid = RelationGetRelid(relation);
/*
* If the row filter caching is currently flagged "invalid" then it means we
@@ -760,7 +854,7 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
TupleDesc tupdesc = RelationGetDescr(relation);
/*
- * Create a tuple table slot for row filter. TupleDesc must live as
+ * Create tuple table slots for row filter. TupleDesc must live as
* long as the cache remains. Release the tuple table slot if it
* already exists.
*/
@@ -769,9 +863,31 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
+ if (entry->old_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->old_tuple);
+ entry->old_tuple = NULL;
+ }
+
+ if (entry->new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->new_tuple);
+ entry->new_tuple = NULL;
+ }
+
+ if (entry->tmp_new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->tmp_new_tuple);
+ entry->tmp_new_tuple = NULL;
+ }
+
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+
MemoryContextSwitchTo(oldctx);
/*
@@ -846,6 +962,75 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
entry->rowfilter_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ ListCell *lc;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstates == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = slot;
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, entry->exprstates)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ ListCell *lc;
/* Bail out if there is no row filter */
if (entry->exprstates == NIL)
@@ -883,7 +1068,6 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
}
/* Cleanup allocated resources */
- ResetExprContext(ecxt);
FreeExecutorState(estate);
PopActiveSnapshot();
@@ -941,6 +1125,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -949,7 +1136,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
break;
/*
@@ -980,9 +1167,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
- /* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter_update_check(relation, oldtuple, newtuple, relentry,
+ &modified_action))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1005,8 +1193,29 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ if (relentry->new_tuple != NULL && !TTS_EMPTY(relentry->new_tuple))
+ logicalrep_write_update_cached(ctx->out, xid, relation,
+ relentry->old_tuple, relentry->new_tuple, data->binary);
+ else
+ logicalrep_write_update(ctx->out, xid, relation, oldtuple,
+ newtuple, data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1016,7 +1225,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1434,6 +1643,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
+ entry->new_tuple = NULL;
+ entry->old_tuple = NULL;
+ entry->tmp_new_tuple = NULL;
entry->exprstates = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
@@ -1655,6 +1867,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
+ if (entry->new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->new_tuple);
+ entry->new_tuple = NULL;
+ }
+ if (entry->old_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->old_tuple);
+ entry->old_tuple = NULL;
+ }
+ if (entry->tmp_new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->tmp_new_tuple);
+ entry->tmp_new_tuple = NULL;
+ }
if (entry->exprstates != NIL)
{
list_free_deep(entry->exprstates);
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..ba71f3f 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -212,6 +213,9 @@ extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
HeapTuple newtuple, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index dff55c2..3fc503f 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -220,7 +220,8 @@ $node_publisher->wait_for_catchup($appname);
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+# but row deleted after the update below.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -232,7 +233,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
--
1.8.3.1
On Thu, Sep 23, 2021 at 10:33 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
5) publicationcmds.c
I mentioned this in my last review [1] already, but I really dislike the
fact that OpenTableList accepts a list containing one of two entirely
separate node types (PublicationTable or Relation). It was modified to
use IsA() instead of a flag, but I still find it ugly, confusing and
possibly error-prone.

Also, not sure mentioning the two different callers explicitly in the
OpenTableList comment is a great idea - it's likely to get stale if
someone adds another caller.
Fixed in v37-0001 [1]
14) pgoutput_row_filter_update
The function name seems a bit misleading, as it might seem like
it updates the row_filter, or something. Should indicate it's about
deciding what to do with the update.
Fixed in v37-0006 [1]
------
[1]: /messages/by-id/CAHut+PtRdXzPpm3qv3cEYWWfVUkGT84EopEHxwt95eo_cG_3eQ@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Oct 26, 2021 at 6:26 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
A few comments for some things I have noticed so far:
1) scantuple cleanup seems to be missing since the v33-0001 patch.
2) I don't think that the ResetExprContext() calls (before
FreeExecutorState()) are needed in the pgoutput_row_filter() and
pgoutput_row_filter_virtual() functions.

3) make check-world fails, due to recent changes to PostgresNode.pm.
These 3 comments are all addressed in v37-0006 [1].
------
[1]: /messages/by-id/CAHut+PtRdXzPpm3qv3cEYWWfVUkGT84EopEHxwt95eo_cG_3eQ@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia.
On Wed, Oct 27, 2021 at 7:21 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
Regarding the v34-0006 patch, shouldn't it also include an update to
the rowfilter_expr_checker() function added by the v34-0002 patch, for
validating the referenced row-filter columns in the case of UPDATE?
I was thinking something like the following (or is it more complex than this?):

diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index dc2f4597e6..579e727b10 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -162,12 +162,10 @@ rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node *rfnode, Relat
 	rowfilter_validator(relname, rfnode);
 
 	/*
-	 * Rule 2: For "delete", check that filter cols are also valid replica
-	 * identity cols.
-	 *
-	 * TODO - check later for publish "update" case.
+	 * Rule 2: For "delete" and "update", check that filter cols are also
+	 * valid replica identity cols.
 	 */
-	if (pub->pubactions.pubdelete)
+	if (pub->pubactions.pubdelete || pub->pubactions.pubupdate)
 	{
 		char		replica_identity = rel->rd_rel->relreplident;
Fixed in v37-0006 [1]
------
[1]: /messages/by-id/CAHut+PtRdXzPpm3qv3cEYWWfVUkGT84EopEHxwt95eo_cG_3eQ@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Thu, Nov 4, 2021 at 2:21 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
Thanks for the patches.
I started to review the patches and here are a few comments.

1)
/*
* ALTER PUBLICATION ... ADD TABLE provides a PublicationTable List
* (Relation, Where clause). ALTER PUBLICATION ... DROP TABLE provides
* a Relation List. Check the List element to be used.
*/
if (IsA(lfirst(lc), PublicationTable))
whereclause = true;
else
whereclause = false;

I am not sure about the comments here; wouldn't it be better to always provide a
PublicationTable List, which would be more consistent?
Fixed in v37-0001 [1].
2)

+			if ($3)
+			{
+				$$->pubtable->whereClause = $3;
+			}

It seems we can remove the if ($3) check here.
Fixed in v37-0001 [1].
3)

+			oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+			rfnode = stringToNode(TextDatumGetCString(rfdatum));
+			exprstate = pgoutput_row_filter_init_expr(rfnode);
+			entry->exprstates = lappend(entry->exprstates, exprstate);
+			MemoryContextSwitchTo(oldctx);
+		}

Currently in the patch, it saves and executes each expression separately. I was
thinking it might be better if we can use "AND" to combine all the expressions
into one expression; then we can initialize and optimize the final expression
and execute it only once.
Yes, thanks for this suggestion - it is an interesting idea. I had
thought the same as this some time ago but never acted on it. I will
try implementing this idea as a separate new patch because it probably
needs to be performance tested against the current code just in case
the extra effort to combine the expressions outweighs any execution
benefits.
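As a toy illustration only (hypothetical types and helper names, not anything from pgoutput.c), the AND-combining idea amounts to folding the per-publication filters into one tree up front, so each row then needs a single short-circuiting evaluation:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for an initialized row-filter expression. */
typedef bool (*RowFilter)(int key);

/* A leaf holds a predicate; an interior node is an implicit AND. */
typedef struct FilterNode
{
    RowFilter   leaf;           /* non-NULL only for leaves */
    struct FilterNode *left;
    struct FilterNode *right;
} FilterNode;

/* Analogue of makeBoolExpr(AND_EXPR, rfnodes, -1): fold the list of
 * publication filters into a single AND tree, built once up front. */
static FilterNode *
combine_with_and(RowFilter *filters, size_t nfilters)
{
    FilterNode *root = NULL;

    for (size_t i = 0; i < nfilters; i++)
    {
        FilterNode *leaf = calloc(1, sizeof(FilterNode));

        leaf->leaf = filters[i];
        if (root == NULL)
            root = leaf;
        else
        {
            FilterNode *parent = calloc(1, sizeof(FilterNode));

            parent->left = root;
            parent->right = leaf;
            root = parent;
        }
    }
    return root;
}

/* Per-row evaluation: one short-circuiting pass over the combined tree,
 * instead of looping over a list of separately initialized filters. */
static bool
eval_filter(const FilterNode *node, int key)
{
    if (node == NULL)
        return true;            /* no row filter: replicate everything */
    if (node->leaf)
        return node->leaf(key);
    return eval_filter(node->left, key) && eval_filter(node->right, key);
}

/* Example predicates, standing in for "id > 1000" and "id < 5000". */
static bool gt_1000(int key) { return key > 1000; }
static bool lt_5000(int key) { return key < 5000; }
```

A row replicates only when every publication's filter matches, which preserves the semantics of the existing per-filter loop while paying the setup (and any expression optimization) cost only once per cache build.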
------
[1]: /messages/by-id/CAHut+PtRdXzPpm3qv3cEYWWfVUkGT84EopEHxwt95eo_cG_3eQ@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia.
On Fri, Nov 5, 2021 at 10:44 AM Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v37* patches.
Few comments about changes made to the patch to rebase it:
1.
+#if 1
+ // FIXME - can we do a better job if integrating this with the schema changes
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
+ foreach(oldlc, oldrelids)
+ {
+ Oid oldrelid = lfirst_oid(oldlc);
+ PublicationRelInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
+ }
+#else
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
I think for the correct merge you need to just call
CheckObjSchemaNotAlreadyInPublication() before this for loop. BTW, I
have a question regarding this implementation. Here, it has been
assumed that the new rel will always be specified with a different
qual, what if there is no qual or if the qual is the same?
2.
+preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t
yyscanner, bool alter_drop)
{
ListCell *cell;
PublicationObjSpec *pubobj;
@@ -17341,7 +17359,15 @@ preprocess_pubobj_list(List *pubobjspec_list,
core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ /* cannot use WHERE w-filter for DROP TABLE from publications */
+ if (pubobj->pubtable && pubobj->pubtable->whereClause && alter_drop)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE"),
+ parser_errposition(pubobj->location));
+
This change looks a bit ad-hoc to me. Can we handle this at a later
point of time in publicationcmds.c?
3.
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
Again this doesn't appear to be the right way. I think this should be
handled at a later point.
--
With Regards,
Amit Kapila.
On Fri, Nov 5, 2021 1:14 PM Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v37* patches.
Thanks for updating the patches.
Few comments:
1) v37-0001
I think it might be better to also show the filter expression in '\d+
tablename' command after publication description.
2) v37-0004
+ /* Scan the expression tree for referenceable objects */
+ find_expr_references_walker(expr, &context);
+
+ /* Remove any duplicates */
+ eliminate_duplicate_dependencies(context.addrs);
+
The 0004 patch currently use find_expr_references_walker to get all the
reference objects. I am thinking do we only need get the columns in the
expression ? I think maybe we can check the replica indentity like[1]rowfilter_expr_checker ... if (replica_identity == REPLICA_IDENTITY_DEFAULT) context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY); else context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);.
3) v37-0005
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr
I think there could be other node types which can also be considered simple
expressions, for example T_NullIfExpr.
Personally, I think it's natural to only check the IMMUTABLE and
whether-user-defined in the new function rowfilter_walker. We can keep the
other row-filter errors which were thrown for EXPR_KIND_PUBLICATION_WHERE in
the 0001 patch.
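To sketch the allow-list walker idea with a hypothetical toy tree (not PostgreSQL's Node machinery or the actual rowfilter_walker), the walker descends the expression and rejects the whole filter as soon as it meets a node kind outside the "simple expression" set, with T_NullIfExpr included as suggested:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy node tags mirroring the "simple expression" discussion; T_SubLink
 * stands in for any node kind that should be rejected. */
typedef enum
{
    T_Var, T_Const, T_OpExpr, T_BoolExpr, T_FuncExpr, T_NullIfExpr,
    T_SubLink
} NodeTag;

/* A toy binary expression tree (real parse trees are richer). */
typedef struct Expr
{
    NodeTag     tag;
    const struct Expr *left;
    const struct Expr *right;
} Expr;

static bool
is_simple_node(NodeTag tag)
{
    switch (tag)
    {
        case T_Var:
        case T_Const:
        case T_OpExpr:
        case T_BoolExpr:
        case T_FuncExpr:
        case T_NullIfExpr:      /* the additional case suggested above */
            return true;
        default:
            return false;
    }
}

/* Analogue of a walker over expression_tree_walker: true only if every
 * node in the tree is on the allow-list, bailing out on the first miss. */
static bool
rowfilter_is_simple(const Expr *node)
{
    if (node == NULL)
        return true;
    if (!is_simple_node(node->tag))
        return false;
    return rowfilter_is_simple(node->left) &&
           rowfilter_is_simple(node->right);
}
```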
[1]:
rowfilter_expr_checker
...
if (replica_identity == REPLICA_IDENTITY_DEFAULT)
context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
else
context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
(void) rowfilter_expr_replident_walker(rfnode, &context);
...
static bool
rowfilter_expr_replident_walker(Node *node, rf_context *context)
{
if (node == NULL)
return false;
if (IsA(node, Var))
{
Oid relid = RelationGetRelid(context->rel);
Var *var = (Var *) node;
AttrNumber attnum = var->varattno - FirstLowInvalidHeapAttributeNumber;
if (!bms_is_member(attnum, context->bms_replident))
{
const char *colname = get_attname(relid, attnum, false);
ereport(ERROR,
(errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
errmsg("cannot add relation \"%s\" to publication",
RelationGetRelationName(context->rel)),
errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
colname)));
return false;
}
return true;
}
return expression_tree_walker(node, rowfilter_expr_replident_walker,
(void *) context);
}
Best regards,
Hou zj
On Friday, November 5, 2021 1:14 PM, Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v37* patches.
Thanks for your patch. I have a problem when using this patch.
The document about "create publication" in patch says:
The <literal>WHERE</literal> clause should contain only columns that are
part of the primary key or be covered by <literal>REPLICA
IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
be replicated.
But when I tried this patch, the columns contained in the WHERE clause had to be
covered by REPLICA IDENTITY; it didn't matter whether they were part of the primary key.
(We can see it in Case 4 of publication.sql, too.) So maybe we should modify the document.
Regards
Tang
On Fri, Nov 5, 2021 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Nov 5, 2021 at 10:44 AM Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v37* patches.
3.

- | ColId
+ | ColId OptWhereClause
  {
  	$$ = makeNode(PublicationObjSpec);
  	$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- 	$$->name = $1;
+ 	if ($2)
+ 	{
+ 		$$->pubtable = makeNode(PublicationTable);
+ 		$$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ 		$$->pubtable->whereClause = $2;
+ 	}
+ 	else
+ 	{
+ 		$$->name = $1;
+ 	}

Again this doesn't appear to be the right way. I think this should be
handled at a later point.
I think the difficulty to handle this at a later point is that we need to make
sure we don't lose the whereclause. Currently, we can only save the whereclause
in PublicationTable structure and the PublicationTable is only used for TABLE,
but '| ColId' can be used for either a SCHEMA or TABLE. We cannot distinguish
the actual type at this stage, so we always need to save the whereclause if
it's NOT NULL.
I think the possible approaches to delay this check are:
(1) we can delete the PublicationTable structure and put all the vars(relation,
whereclause) in PublicationObjSpec. In this approach, we don't need check if
the whereclause is NULL in the '| ColId', we can check this at a later point.
Or
(2) Add a new pattern for whereclause in PublicationObjSpec:
The change could be:
PublicationObjSpec:
...
| ColId
...
+ | ColId WHERE '(' a_expr ')'
+ {
+ $$ = makeNode(PublicationObjSpec);
+ $$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $4;
+ }
In this approach, we also don't need the "if ($2)" check.
What do you think ?
Best regards,
Hou zj
PSA new set of v38* patches.
This addresses some review comments as follows:
v38-0001 = the "main" patch.
- rebased to HEAD
- fixed Amit's review comment about ALTER ... DROP [1]
- fixed Houz's review comment about psql \d+ [2]
v38-0002 = tab auto-complete.
- not changed
v38-0003 = cache updates.
- fixed Houz's review comment about combining multiple filters [3]
v38-0004 = filter validation replica identity.
- fixed Tang's review comment about REPLICA IDENTITY docs [4]
v38-0005 = filter validation walker.
- not changed
v38-0006 = support old/new tuple logic for row-filters.
- not changed
------
[1]: Amit 5/11 #2 - /messages/by-id/CAA4eK1KN5gsTo6Qaomt-9vpC61cgw5ikgzLhOunf3o22G3uc_Q@mail.gmail.com
[2]: Houz 8/11 #1 - /messages/by-id/OS0PR01MB571625D4A5CC1DAB4045B2BB94919@OS0PR01MB5716.jpnprd01.prod.outlook.com
[3]: Houz 4/11 #3 - /messages/by-id/OS0PR01MB5716090A70A73ADF58C58950948D9@OS0PR01MB5716.jpnprd01.prod.outlook.com
[4]: Tang 9/11 - /messages/by-id/OS0PR01MB6113895D7964F03E9F57F9C7FB929@OS0PR01MB6113.jpnprd01.prod.outlook.com
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v38-0003-PS-ExprState-cache-modifications.patchapplication/octet-stream; name=v38-0003-PS-ExprState-cache-modifications.patchDownload
From 23101be0636866f4d8b9f09cb6d4929e4c9d3686 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Tue, 9 Nov 2021 17:59:36 +1100
Subject: [PATCH v38] PS - ExprState cache modifications.
Now the cached row-filter caches (e.g. ExprState *) are invalidated only in
rel_sync_cache_relation_cb function, so it means the ALTER PUBLICATION for one
table should not cause row-filters of other tables to also become invalidated.
Also all code related to caching row-filters has been removed from the
get_rel_sync_entry function and is now done just before they are needed in the
pgoutput_row_filter function.
If there are multiple publication filters for a given table, these are all
combined into a single filter.
Author: Peter Smith, Greg Nancarrow
Changes are based on a suggestions from Amit [1] [2], and Houz [3]
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
[3] https://www.postgresql.org/message-id/OS0PR01MB5716090A70A73ADF58C58950948D9%40OS0PR01MB5716.jpnprd01.prod.outlook.com
---
src/backend/replication/pgoutput/pgoutput.c | 229 +++++++++++++++++++---------
1 file changed, 154 insertions(+), 75 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 077ae18..f9fdbb0 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -1,4 +1,4 @@
-/*-------------------------------------------------------------------------
+/*------------------------------------------------------------------------
*
* pgoutput.c
* Logical Replication output plugin
@@ -21,6 +21,7 @@
#include "executor/executor.h"
#include "fmgr.h"
#include "nodes/nodeFuncs.h"
+#include "nodes/makefuncs.h"
#include "optimizer/optimizer.h"
#include "parser/parse_coerce.h"
#include "replication/logical.h"
@@ -123,7 +124,15 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
- List *exprstate; /* ExprState for row filter */
+
+ /*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' only means the exprstate * is correct -
+ * It doesn't mean that there actually is any row filter present for the
+ * current relid.
+ */
+ bool rowfilter_valid;
+ ExprState *exprstate; /* ExprState for row filter(s) */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
@@ -161,7 +170,7 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static EState *create_estate_for_relation(Relation rel);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
/*
@@ -731,20 +740,134 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
* If it returns true, the change is replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
{
EState *estate;
ExprContext *ecxt;
ListCell *lc;
bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ List *rfnodes = NIL;
+ int n_filters;
+
+ /*
+ * If the row filter caching is currently flagged "invalid" then it means we
+ * don't know yet if there is/isn't any row filters for this relation.
+ *
+ * This code is usually one-time execution.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ bool am_partition = get_rel_relispartition(relid);
+ MemoryContext oldctx;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains. Release the tuple table slot if it
+ * already exists.
+ */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState and cache it in entry->exprstate.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+ Oid pub_relid = relid;
+
+ if (pub->pubviaroot && am_partition)
+ {
+ if (pub->alltables)
+ pub_relid = llast_oid(get_partition_ancestors(relid));
+ else
+ {
+ List *ancestors = get_partition_ancestors(relid);
+ ListCell *lc2;
+
+ /*
+ * Find the "topmost" ancestor that is in this
+ * publication.
+ */
+ foreach(lc2, ancestors)
+ {
+ Oid ancestor = lfirst_oid(lc2);
+
+ if (list_member_oid(GetRelationPublications(ancestor),
+ pub->oid))
+ {
+ pub_relid = ancestor;
+ }
+ }
+ }
+ }
+
+ /*
+ * Lookup if there is a row-filter, and if yes remember it in a list.
+ * In code following this 'publications' loop we will combine all filters.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(pub_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes = lappend(rfnodes, rfnode);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+
+ } /* loop all subscribed publications */
+
+ /*
+ * Combine all the row-filters (if any) into a single filter, and then build the ExprState for it
+ */
+ n_filters = list_length(rfnodes);
+ if (n_filters > 0)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = n_filters > 1 ? makeBoolExpr(AND_EXPR, rfnodes, -1) : linitial(rfnodes);
+ entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
+
+ list_free(rfnodes);
+ }
+
+ entry->rowfilter_valid = true;
+ }
/* Bail out if there is no row filter */
- if (entry->exprstate == NIL)
+ if (!entry->exprstate)
return true;
elog(DEBUG3, "table \"%s.%s\" has row filter",
- get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
- get_rel_name(relation->rd_id));
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
PushActiveSnapshot(GetTransactionSnapshot());
@@ -757,20 +880,13 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
/*
- * If the subscription has multiple publications and the same table has a
- * different row filter in these publications, all row filters must be
- * matched in order to replicate this change.
+ * NOTE: Multiple publication row-filters have already been combined to a
+ * single exprstate.
*/
- foreach(lc, entry->exprstate)
+ if (entry->exprstate)
{
- ExprState *exprstate = (ExprState *) lfirst(lc);
-
/* Evaluates row filter */
- result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
-
- /* If the tuple does not match one of the row filters, bail out */
- if (!result)
- break;
+ result = pgoutput_row_filter_exec_expr(entry->exprstate, ecxt);
}
/* Cleanup allocated resources */
@@ -840,7 +956,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
break;
/*
@@ -873,7 +989,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -907,7 +1023,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1321,10 +1437,11 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
- entry->exprstate = NIL;
+ entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1344,7 +1461,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
List *schemaPubids = GetSchemaPublications(schemaId);
ListCell *lc;
Oid publish_as_relid = relid;
- TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1358,22 +1474,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
publications_valid = true;
}
- /* Release tuple table slot */
- if (entry->scantuple != NULL)
- {
- ExecDropSingleTupleTableSlot(entry->scantuple);
- entry->scantuple = NULL;
- }
-
- /*
- * Create a tuple table slot for row filter. TupleDesc must live as
- * long as the cache remains.
- */
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- tupdesc = CreateTupleDescCopy(tupdesc);
- entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
- MemoryContextSwitchTo(oldctx);
-
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1383,9 +1483,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
{
Publication *pub = lfirst(lc);
bool publish = false;
- HeapTuple rftuple;
- Datum rfdatum;
- bool rfisnull;
if (pub->alltables)
{
@@ -1449,33 +1546,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- /*
- * Cache row filter, if available. All publication-table mappings
- * must be checked. If it is a partition and pubviaroot is true,
- * use the row filter of the topmost partitioned table instead of
- * the row filter of its own partition.
- */
- rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
- if (HeapTupleIsValid(rftuple))
- {
- rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
-
- if (!rfisnull)
- {
- Node *rfnode;
- ExprState *exprstate;
-
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = stringToNode(TextDatumGetCString(rfdatum));
-
- /* Prepare for expression execution */
- exprstate = pgoutput_row_filter_init_expr(rfnode);
- entry->exprstate = lappend(entry->exprstate, exprstate);
- MemoryContextSwitchTo(oldctx);
- }
-
- ReleaseSysCache(rftuple);
- }
}
list_free(pubids);
@@ -1582,6 +1652,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanups. (Will be rebuilt later if needed).
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstate != NIL)
+ {
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
+ }
}
}
@@ -1622,12 +1707,6 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
-
- if (entry->exprstate != NIL)
- {
- list_free_deep(entry->exprstate);
- entry->exprstate = NIL;
- }
}
MemoryContextSwitchTo(oldctx);
--
1.8.3.1
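The pgoutput hunks above cache a compiled row filter per relation and clear that state from the relcache invalidation callback, rebuilding it lazily on the next lookup. A rough Python model of that lazy-build/invalidate pattern (illustrative only; these names and the string "compilation" are simplifications, not PostgreSQL APIs):

```python
# Illustrative model of the RelSyncEntry row-filter caching scheme:
# build the filter on demand, mark it invalid from the callback.
class RelSyncEntry:
    def __init__(self):
        self.rowfilter_valid = False
        self.exprstate = None  # "compiled" filter, built on demand

def get_entry(cache, relid, filter_source):
    entry = cache.setdefault(relid, RelSyncEntry())
    if not entry.rowfilter_valid:
        # compile the textual filter once and cache the result
        entry.exprstate = eval("lambda row: " + filter_source)
        entry.rowfilter_valid = True
    return entry

def invalidation_cb(cache, relid):
    # mirror of rel_sync_cache_relation_cb: drop cached filter state
    if relid in cache:
        cache[relid].rowfilter_valid = False
        cache[relid].exprstate = None
```

The point of the pattern is that invalidation is cheap (just a flag plus freeing state); the cost of recompiling the filter is paid only if the relation is published again.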
v38-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patch (application/octet-stream)
From dee5c3ca4330d9476afdfcec43254accefc8d093 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Tue, 9 Nov 2021 17:05:13 +1100
Subject: [PATCH v38] PS - Add tab auto-complete support for the Row Filter
WHERE.
e.g.
"CREATE PUBLICATION <name> FOR TABLE <name>" - complete with "WHERE (".
"ALTER PUBLICATION <name> ADD|SET TABLE <name>" - complete with "WHERE (".
Author: Peter Smith
---
src/bin/psql/tab-complete.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 4f724e4..8c7fe7d 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,14 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (");
+ /* "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with table attributes */
+ /* "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2757,10 +2765,13 @@ psql_completion(const char *text, int start, int end)
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
+ /* "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
/*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
--
1.8.3.1
v38-0005-PS-Row-filter-validation-walker.patch (application/octet-stream)
From e2ce74b6f5c0ab242e6462fa5827fdf3a1273c9a Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 10 Nov 2021 08:47:13 +1100
Subject: [PATCH v38] PS - Row filter validation walker.
This patch implements a parse-tree "walker" to validate a row-filter expression.
Only very simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no user-defined functions.
- no system functions (unless they are IMMUTABLE). See design decision at [1].
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr
[1] https://www.postgresql.org/message-id/CAA4eK1%2BXoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ%40mail.gmail.com
This patch also disables (#if 0) all other row-filter errors which were thrown for
EXPR_KIND_PUBLICATION_WHERE in the 0001 patch.
Some regression tests are updated due to modified validation messages and rules.
Author: Peter Smith
---
src/backend/catalog/dependency.c | 93 +++++++++++++++++++++++++++++++
src/backend/catalog/pg_publication.c | 20 ++++---
src/backend/parser/parse_agg.c | 5 +-
src/backend/parser/parse_expr.c | 6 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 2 +
src/include/catalog/dependency.h | 2 +-
src/test/regress/expected/publication.out | 27 ++++++---
src/test/regress/sql/publication.sql | 13 ++++-
9 files changed, 151 insertions(+), 20 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index 5566ffa..2916ead 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -133,6 +133,12 @@ typedef struct
int subflags; /* flags to pass down when recursing to obj */
} ObjectAddressAndFlags;
+/* for rowfilter_walker */
+typedef struct
+{
+ char *relname;
+} rf_context;
+
/* for find_expr_references_walker */
typedef struct
{
@@ -1569,6 +1575,93 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Walker checks that the row filter expression is legal. Allow only simple
+ * or compound expressions like:
+ *
+ * "(Var Op Const)" or
+ * "(Var Op Const) Bool (Var Op Const)"
+ *
+ * User-defined operators are not allowed.
+ * User-defined functions are not allowed.
+ * System functions that are not IMMUTABLE are not allowed.
+ */
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+ bool too_complex = false;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var) || IsA(node, Const) || IsA(node, BoolExpr))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ Oid funcid = ((FuncExpr *)node)->funcid;
+ char *funcname = get_func_name(funcid);
+
+ /*
+ * User-defined functions are not allowed.
+ * System-functions that are not IMMUTABLE are not allowed.
+ */
+ if (funcid >= FirstNormalObjectId)
+ {
+ forbidden = psprintf("user-defined functions are not allowed: %s",
+ funcname);
+ }
+ else
+ {
+ if (func_volatile(funcid) != PROVOLATILE_IMMUTABLE)
+ forbidden = psprintf("system functions that are not IMMUTABLE are not allowed: %s",
+ funcname);
+ }
+ }
+ else
+ {
+ elog(DEBUG1, "the row filter contained something unexpected: %s", nodeToString(node));
+ too_complex = true;
+ }
+
+ if (too_complex)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ context->relname),
+ errhint("only simple expressions using columns, constants and immutable system functions are allowed")
+ ));
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ context->relname),
+ errdetail("%s", forbidden)
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
+/*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if it encounters anything not permitted or unexpected.
+ */
+void
+rowfilter_validator(char *relname, Node *expr)
+{
+ rf_context context = {0};
+
+ context.relname = relname;
+ rowfilter_walker(expr, &context);
+}
+
+/*
* Find all the columns referenced by the row-filter expression and return them
* as a list of attribute numbers. This list is used for row-filter validation.
*/
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index e04e069..f56e01f 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -216,20 +216,26 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
/*
* Decide if the row-filter is valid according to the following rules:
*
- * Rule 1. If the publish operation contains "delete" then only columns that
+ * Rule 1. Walk the parse-tree and reject anything other than very simple
+ * expressions. (See rowfilter_validator for details of what is permitted.)
+ *
+ * Rule 2. If the publish operation contains "delete" then only columns that
* are allowed by the REPLICA IDENTITY rules are permitted to be used in the
* row-filter WHERE clause.
- *
- * Rule 2. TODO
*/
static void
-rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node *rfnode, Relation rel)
{
Oid relid = RelationGetRelid(rel);
char *relname = RelationGetRelationName(rel);
/*
- * Rule 1: For "delete", check that filter cols are also valid replica
+ * Rule 1. Walk the parse-tree and reject anything unexpected.
+ */
+ rowfilter_validator(relname, rfnode);
+
+ /*
+ * Rule 2: For "delete", check that filter cols are also valid replica
* identity cols.
*
* TODO - check later for publish "update" case.
@@ -381,13 +387,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
whereclause = transformWhereClause(pstate,
copyObject(pri->whereClause),
EXPR_KIND_PUBLICATION_WHERE,
- "PUBLICATION");
+ "PUBLICATION WHERE");
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
/* Validate the row-filter. */
- rowfilter_expr_checker(pub, whereclause, targetrel);
+ rowfilter_expr_checker(pub, pstate, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..5e0c391 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,12 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
if (isAgg)
err = _("aggregate functions are not allowed in publication WHERE expressions");
else
err = _("grouping operations are not allowed in publication WHERE expressions");
-
+#endif
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +952,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("window functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..3519e62 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -201,6 +201,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
case T_FuncCall:
{
+#if 0
/*
* Forbid functions in publication WHERE condition
*/
@@ -209,6 +210,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("functions are not allowed in publication WHERE expressions"),
parser_errposition(pstate, exprLocation(expr))));
+#endif
result = transformFuncCall(pstate, (FuncCall *) expr);
break;
@@ -1777,7 +1779,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("cannot use subquery in publication WHERE expression");
+#endif
break;
/*
@@ -3100,7 +3104,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 29bebb7..212f473 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,10 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+ pstate->p_hasTargetSRFs = true;
+#if 0
err = _("set-returning functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..b3588df 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,12 +718,14 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+#if 0
/* Check it's not a custom operator for publication WHERE expressions */
if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("user-defined operators are not allowed in publication WHERE expressions"),
parser_errposition(pstate, location)));
+#endif
/* Do typecasting and build the expression tree */
if (ltree == NULL)
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 2427321..e118256 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -16,7 +16,6 @@
#include "catalog/objectaddress.h"
-
/*
* Precise semantics of a dependency relationship are specified by the
* DependencyType code (which is stored in a "char" field in pg_depend,
@@ -153,6 +152,7 @@ extern void performMultipleDeletions(const ObjectAddresses *objects,
DropBehavior behavior, int flags);
extern List *rowfilter_find_cols(Node *expr, Oid relId);
+extern void rowfilter_validator(char *relname, Node *expr);
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index d0a6e43..805c777 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -335,19 +335,29 @@ Tables:
"testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
DROP PUBLICATION testpub_syntax2;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: user-defined functions are not allowed: testpub_rf_func99
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: system functions that are not IMMUTABLE are not allowed: random
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+HINT: only simple expressions using columns, constants and immutable system functions are allowed
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
DROP TABLE testpub_rf_tbl1;
@@ -359,6 +369,7 @@ DROP SCHEMA testpub_rf_myschema;
DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
-- ======================================================
-- More row filter tests for validating column references
CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index eb1ee0d..5fc8fdf 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -172,13 +172,21 @@ CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschem
RESET client_min_messages;
\dRp+ testpub_syntax2
DROP PUBLICATION testpub_syntax2;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
DROP TABLE testpub_rf_tbl1;
@@ -190,6 +198,7 @@ DROP SCHEMA testpub_rf_myschema;
DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
-- ======================================================
-- More row filter tests for validating column references
--
1.8.3.1
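The 0005 walker above whitelists only Var, Const, BoolExpr, built-in OpExpr, and immutable built-in FuncExpr nodes, rejecting everything else. A rough Python model of those rules (illustrative only; node tuples and details are simplified, though 16384 is the real FirstNormalObjectId boundary above which objects are user-defined):

```python
# Rough model of rowfilter_walker's whitelist (not PostgreSQL code).
FIRST_NORMAL_OBJECT_ID = 16384  # user-defined objects have OIDs >= this

def validate(node):
    """node is a tuple ('kind', *details); raise ValueError if forbidden."""
    kind = node[0]
    if kind in ('var', 'const'):
        return
    if kind == 'opexpr':            # ('opexpr', opno, left, right)
        if node[1] >= FIRST_NORMAL_OBJECT_ID:
            raise ValueError('user-defined operators are not allowed')
        for arg in node[2:]:
            validate(arg)
    elif kind == 'funcexpr':        # ('funcexpr', funcid, volatility, args)
        if node[1] >= FIRST_NORMAL_OBJECT_ID:
            raise ValueError('user-defined functions are not allowed')
        if node[2] != 'immutable':
            raise ValueError('non-IMMUTABLE system functions are not allowed')
        for arg in node[3]:
            validate(arg)
    elif kind == 'boolexpr':        # ('boolexpr', args)
        for arg in node[1]:
            validate(arg)
    else:
        # anything else (sublinks, aggregates, ...) is "too complex"
        raise ValueError('only simple expressions are allowed')
```

Like the C walker, the model recurses through the allowed node kinds and fails fast on the first forbidden construct, so a subquery or user-defined operator anywhere in the tree rejects the whole filter.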
v38-0004-PS-Row-filter-validation-of-replica-identity.patch (application/octet-stream)
From 2f36f4ab5da9ca16b0081fa03962ccf79d37cbd7 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Tue, 9 Nov 2021 19:24:00 +1100
Subject: [PATCH v38] PS - Row filter validation of replica identity.
This patch introduces some additional row filter validation. Currently it is
implemented only for the publish mode "delete" and it validates that any columns
referenced in the filter expression must be part of REPLICA IDENTITY or Primary
Key.
Also updated test code and PG docs.
Author: Peter Smith
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
---
doc/src/sgml/ref/create_publication.sgml | 5 +-
src/backend/catalog/dependency.c | 55 ++++++++++++++++
src/backend/catalog/pg_publication.c | 77 +++++++++++++++++++++-
src/include/catalog/dependency.h | 2 +
src/test/regress/expected/publication.out | 105 +++++++++++++++++++++++++++---
src/test/regress/sql/publication.sql | 83 ++++++++++++++++++++++-
src/test/subscription/t/025_row_filter.pl | 7 +-
7 files changed, 314 insertions(+), 20 deletions(-)
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 52f6a1c..03cc956 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -231,8 +231,9 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
The <literal>WHERE</literal> clause should contain only columns that are
- part of the primary key or be covered by <literal>REPLICA
- IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
+ covered by <literal>REPLICA IDENTITY</literal>, or are part of the primary
+ key (when <literal>REPLICA IDENTITY</literal> is not set), otherwise
+ <command>DELETE</command> operations will not
be replicated. That's because old row is used and it only contains primary
key or columns that are part of the <literal>REPLICA IDENTITY</literal>; the
remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index fe9c714..5566ffa 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -1569,6 +1569,61 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Find all the columns referenced by the row-filter expression and return them
+ * as a list of attribute numbers. This list is used for row-filter validation.
+ */
+List *
+rowfilter_find_cols(Node *expr, Oid relId)
+{
+ find_expr_references_context context;
+ RangeTblEntry rte;
+ int ref;
+ List *rfcols = NIL;
+
+ context.addrs = new_object_addresses();
+
+ /* We gin up a rather bogus rangetable list to handle Vars */
+ MemSet(&rte, 0, sizeof(rte));
+ rte.type = T_RangeTblEntry;
+ rte.rtekind = RTE_RELATION;
+ rte.relid = relId;
+ rte.relkind = RELKIND_RELATION; /* no need for exactness here */
+ rte.rellockmode = AccessShareLock;
+
+ context.rtables = list_make1(list_make1(&rte));
+
+ /* Scan the expression tree for referenceable objects */
+ find_expr_references_walker(expr, &context);
+
+ /* Remove any duplicates */
+ eliminate_duplicate_dependencies(context.addrs);
+
+ /* Build/Return the list of columns referenced by this Row Filter */
+ for (ref = 0; ref < context.addrs->numrefs; ref++)
+ {
+ ObjectAddress *thisobj = context.addrs->refs + ref;
+
+ if (thisobj->classId == RelationRelationId)
+ {
+ AttrNumber attnum;
+
+ /*
+ * The parser already took care of ensuring columns must be from
+ * the correct table.
+ */
+ Assert(thisobj->objectId == relId);
+
+ attnum = thisobj->objectSubId;
+ rfcols = lappend_int(rfcols, attnum);
+ }
+ }
+
+ free_object_addresses(context.addrs);
+
+ return rfcols;
+}
+
+/*
* recordDependencyOnExpr - find expression dependencies
*
* This is used to find the dependencies of rules, constraint expressions,
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 57d08e7..e04e069 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -214,9 +214,79 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
}
/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
+ * Decide if the row-filter is valid according to the following rules:
+ *
+ * Rule 1. If the publish operation contains "delete" then only columns that
+ * are allowed by the REPLICA IDENTITY rules are permitted to be used in the
+ * row-filter WHERE clause.
+ *
+ * Rule 2. TODO
*/
+static void
+rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+{
+ Oid relid = RelationGetRelid(rel);
+ char *relname = RelationGetRelationName(rel);
+
+ /*
+ * Rule 1: For "delete", check that filter cols are also valid replica
+ * identity cols.
+ *
+ * TODO - check later for publish "update" case.
+ */
+ if (pub->pubactions.pubdelete)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ List *rfcols;
+ ListCell *lc;
+ Bitmapset *bms_okcols;
+
+ /*
+ * Find what are the cols that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+
+ /*
+ * Find what cols are referenced in the row filter WHERE clause,
+ * and validate that each of those referenced cols is allowed.
+ */
+ rfcols = rowfilter_find_cols(rfnode, relid);
+ foreach(lc, rfcols)
+ {
+ int attnum = lfirst_int(lc);
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, bms_okcols))
+ {
+ const char *colname = get_attname(relid, attnum, false);
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ relname),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+ }
+
+ bms_free(bms_okcols);
+ list_free(rfcols);
+ }
+ }
+}
+
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -315,6 +385,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 3eca295..2427321 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -152,6 +152,8 @@ extern void performDeletion(const ObjectAddress *object,
extern void performMultipleDeletions(const ObjectAddresses *objects,
DropBehavior behavior, int flags);
+extern List *rowfilter_find_cols(Node *expr, Oid relId);
+
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
DependencyType behavior);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index d5dd3a6..d0a6e43 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -246,13 +246,15 @@ CREATE TABLE testpub_rf_tbl4 (g text);
CREATE SCHEMA testpub_rf_myschema;
CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -262,7 +264,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -273,7 +275,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -284,7 +286,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
@@ -308,26 +310,26 @@ Publications:
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax1
Publication testpub_syntax1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE ((e < 999))
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax2
Publication testpub_syntax2
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
@@ -357,6 +359,91 @@ DROP SCHEMA testpub_rf_myschema;
DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 75e011c..eb1ee0d 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -141,7 +141,9 @@ CREATE TABLE testpub_rf_tbl4 (g text);
CREATE SCHEMA testpub_rf_myschema;
CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- First, test using the option publish = "insert", because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -161,12 +163,12 @@ RESET client_min_messages;
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax1
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax2
DROP PUBLICATION testpub_syntax2;
@@ -189,6 +191,81 @@ DROP PUBLICATION testpub5;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index e806b5d..dff55c2 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -223,9 +225,7 @@ $node_publisher->wait_for_catchup($appname);
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -234,7 +234,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
--
1.8.3.1
v38-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v38-0001-Row-filter-for-logical-replication.patchDownload
From 0a3c633203a536374673f24bd745bb7e093db78a Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Tue, 9 Nov 2021 16:37:25 +1100
Subject: [PATCH v38] Row filter for logical replication.
This feature adds a row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY; otherwise, DELETEs
won't be replicated. DELETE uses the old row version (which is limited to the
primary key or REPLICA IDENTITY columns) to evaluate the row filter. INSERT
and UPDATE use the new row version to evaluate the row filter, so any column
can be used. If the row filter evaluates to NULL, it is treated as false. For
simplicity, functions are not allowed; this could be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If the subscriber is a pre-15 version, data
synchronization won't use row filters even if they are defined on the
publisher, because previous versions cannot handle row filters.
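The initial-sync rule above (a row is copied only if it satisfies all of the table's WHERE clauses across the subscribed publications) can be sketched as follows; the helper name and sample filters are illustrative, not part of the patch:

```python
# A row is copied during initial synchronization only if every publication's
# row filter for the table evaluates to true. None models SQL NULL, which
# does not count as satisfying a filter.

def copy_row(row, filters):
    return all(f(row) is True for f in filters)

# Two publications publishing the same table with different WHERE clauses,
# e.g. WHERE (e > 1000) and WHERE (e < 2000):
filters = [lambda r: r["e"] > 1000, lambda r: r["e"] < 2000]

assert copy_row({"e": 1500}, filters)       # satisfies both
assert not copy_row({"e": 2500}, filters)   # fails the second filter
```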
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines whether the partition's row filter (if
the parameter is false, the default) or the root partitioned table's row
filter is used.
Psql commands \dRp+ and \d+ will display any row filters.
Author: Euler Taveira, Peter Smith
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 13 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 50 ++++-
src/backend/commands/publicationcmds.c | 37 +++-
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 25 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 257 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 27 ++-
src/include/catalog/pg_publication.h | 4 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 118 +++++++++++
src/test/regress/sql/publication.sql | 55 +++++
src/test/subscription/t/025_row_filter.pl | 300 ++++++++++++++++++++++++++++
25 files changed, 1068 insertions(+), 40 deletions(-)
mode change 100644 => 100755 src/backend/parser/gram.y
create mode 100644 src/test/subscription/t/025_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1d11be..af6b1f6 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6299,6 +6299,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index bb4ef5e..01247d7 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -30,7 +30,7 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -52,7 +52,9 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
remove one or more tables/schemas from the publication. Note that adding
tables/schemas to a publication that is already subscribed to will require a
<literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on the
- subscribing side in order to become effective.
+ subscribing side in order to become effective. Note also that a row-filter
+ <literal>WHERE</literal> clause cannot be combined with the
+ <literal>DROP</literal> clause.
</para>
<para>
@@ -109,7 +111,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ca01d8c..52f6a1c 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -28,7 +28,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -76,6 +76,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -226,6 +230,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will
+ not be replicated. That is because the old row is used, and it only
+ contains the primary key or the columns that are part of the
+ <literal>REPLICA IDENTITY</literal>; the remaining columns are
+ <literal>NULL</literal>. For <command>INSERT</command> and
+ <command>UPDATE</command> operations, any column may be used in the
+ <literal>WHERE</literal> clause, because the new row is used and it
+ contains all columns. A <literal>NULL</literal> value causes the
+ expression to evaluate to false; avoid using columns without not-null
+ constraints in the <literal>WHERE</literal> clause. The
+ <literal>WHERE</literal> clause does not allow functions or user-defined
+ operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -240,6 +259,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -253,6 +277,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..5a9430e 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -206,7 +206,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether to copy pre-existing data in the publications
that are being subscribed to when the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15, then any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index fed83b8..57d08e7 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -251,22 +254,28 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -283,10 +292,30 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -300,6 +329,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -316,6 +351,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 7d4a0e9..0a7d290 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -529,6 +529,28 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *delrels = NIL;
ListCell *oldlc;
+#if 1
+ /* FIXME: can we do a better job of integrating this with the schema changes? */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
+ foreach(oldlc, oldrelids)
+ {
+ Oid oldrelid = lfirst_oid(oldlc);
+ PublicationRelInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
+ }
+#else
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
@@ -565,6 +587,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
delrels = lappend(delrels, pubrel);
}
}
+#endif
/* And drop them. */
PublicationDropTables(pubid, delrels, true);
@@ -931,7 +954,9 @@ OpenTableList(List *tables)
}
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = myrelid;
pub_rel->relation = rel;
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
@@ -966,7 +991,10 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = childrelid;
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -993,6 +1021,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -1042,7 +1072,7 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(pub_rel->relid, GetUserId()))
aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
RelationGetRelationName(rel));
@@ -1088,6 +1118,11 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
RelationGetRelationName(rel))));
}
+ if (pubrel->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE")));
+
ObjectAddressSet(obj, PublicationRelRelationId, prid);
performDeletion(&obj, DROP_CASCADE, 0);
}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index ad1ea2f..e3b0039 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4830,6 +4830,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index f537d3e..5776447 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2312,6 +2312,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
old mode 100644
new mode 100755
index a6d0cef..8ca7f15
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9652,12 +9652,13 @@ CreatePublicationStmt:
* relation_expr here.
*/
PublicationObjSpec:
- TABLE relation_expr
+ TABLE relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_TABLE;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $2;
+ $$->pubtable->whereClause = $3;
}
| ALL TABLES IN_P SCHEMA ColId
{
@@ -9672,28 +9673,39 @@ PublicationObjSpec:
$$->pubobjtype = PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA;
$$->location = @5;
}
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
$$->location = @1;
}
- | ColId indirection
+ | ColId indirection OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
+ $$->pubtable->whereClause = $3;
$$->location = @1;
}
/* grammar like tablename * , ONLY tablename, ONLY ( tablename ) */
- | extended_relation_expr
+ | extended_relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $1;
+ $$->pubtable->whereClause = $2;
}
| CURRENT_SCHEMA
{
@@ -17341,7 +17353,8 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ if (pubobj->name)
{
/* convert it to PublicationTable */
PublicationTable *pubtable = makeNode(PublicationTable);
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 542f916..29bebb7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Disallow user-defined operators in publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..9d86a10 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get relation qual. DISTINCT prevents the same expression from being
+ * included multiple times in the final expression when the table belongs
+ * to multiple publications.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
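To make the query construction in copy_table() above easier to follow, here is a Python sketch of the COPY command it ends up building; the helper name build_copy_command and its naive identifier quoting are illustrative only, not part of the patch:

```python
def build_copy_command(nspname, relname, attnames, quals):
    """Sketch of copy_table()'s logic: plain COPY when no row filter
    applies, otherwise COPY (SELECT cols FROM tbl WHERE q1 AND q2 ...)."""
    target = f'"{nspname}"."{relname}"'
    if not quals:
        # Regular table with no row filter: fast path.
        return f"COPY {target} TO STDOUT"
    cols = ", ".join(attnames)
    # Multiple filters from different publications are AND'ed together.
    where = " AND ".join(quals)
    return f"COPY (SELECT {cols} FROM {target} WHERE {where}) TO STDOUT"
```

For example, the filter from the cover letter would yield COPY (SELECT id, name FROM "public"."departments" WHERE (id > 2000 AND id <= 3000)) TO STDOUT.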
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6f6a203..077ae18 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +148,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +157,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +639,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache the ExprState using CacheMemoryContext. This is the same code as
+ * ExecPrepareExpr(), which is not used here because no EState is available.
+ * It should probably be another function in the executor to handle the
+ * execution outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the
+ * change is not replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated; otherwise it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +808,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +832,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +839,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * Send the schema before the relation is possibly switched to its
+ * ancestor, because maybe_send_schema() also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +872,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +906,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +975,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1116,9 +1297,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1141,6 +1323,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1160,6 +1344,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *schemaPubids = GetSchemaPublications(schemaId);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1173,6 +1358,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for the row filter. The TupleDesc must live
+ * as long as the cache entry remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1182,6 +1383,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1245,9 +1449,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1365,6 +1593,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1603,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1391,7 +1622,15 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ {
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
+ }
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
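The semantics pgoutput_row_filter() implements above can be summarized in a short Python sketch (the helper row_filter_matches is hypothetical; None stands in for a SQL NULL result from the expression):

```python
def row_filter_matches(filter_results):
    """Sketch of pgoutput_row_filter() semantics: when a table appears in
    several publications of the subscription, every row filter must
    evaluate to true for the change to be replicated; a NULL result
    (None here) counts as false. No filters at all means 'replicate'."""
    for result in filter_results:
        if result is None or result is False:
            return False
    return True
```

So a tuple is skipped as soon as one filter fails, matching the early break in the foreach loop above.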
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 7e98371..b404fd2 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4229,6 +4229,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4239,9 +4240,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4250,6 +4258,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4290,6 +4299,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationObject(&(pubrinfo[j].dobj), fout);
@@ -4360,8 +4373,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index d1d8608..0842a3c 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -630,6 +630,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
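The pg_dump change above moves the terminating semicolon so the row filter, when present, can be appended to the dumped statement. A Python sketch of the emitted text (helper name hypothetical, quoting omitted):

```python
def dump_publication_table(pubname, table, qual=None):
    """Sketch of dumpPublicationTable(): append ' WHERE (<qual>)' before
    the terminating semicolon when the table has a row filter."""
    cmd = f"ALTER PUBLICATION {pubname} ADD TABLE ONLY {table}"
    if qual is not None:
        cmd += f" WHERE ({qual})"
    return cmd + ";\n"
```

A table without a filter dumps exactly as before, so existing dumps are unaffected.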
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index ea721d9..8be5643 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3150,17 +3150,22 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
" JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
"WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
"UNION\n"
"SELECT pubname\n"
+ " , pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+ " , pg_catalog.pg_class c\n"
"WHERE pr.prrelid = '%s'\n"
+ " AND c.oid = pr.prrelid\n"
"UNION\n"
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
"WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
"ORDER BY 1;",
@@ -3196,6 +3201,13 @@ describeOneTableDetails(const char *schemaname,
printfPQExpBuffer(&buf, " \"%s\"",
PQgetvalue(result, i, 0));
+ if (pset.sversion >= 150000)
+ {
+ /* Also display the publication row-filter (if any) for this table */
+ if (!PQgetisnull(result, i, 1))
+ appendPQExpBuffer(&buf, " WHERE (%s)", PQgetvalue(result, i, 1));
+ }
+
printTableAddFooter(&cont, buf.data);
}
PQclear(result);
@@ -6320,8 +6332,12 @@ addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
for (i = 0; i < count; i++)
{
if (!singlecol)
+ {
printfPQExpBuffer(buf, " \"%s.%s\"", PQgetvalue(res, i, 0),
PQgetvalue(res, i, 1));
+ if (!PQgetisnull(res, i, 2))
+ appendPQExpBuffer(buf, " WHERE (%s)", PQgetvalue(res, i, 2));
+ }
else
printfPQExpBuffer(buf, " \"%s\"", PQgetvalue(res, i, 0));
@@ -6450,8 +6466,15 @@ describePublications(const char *pattern)
{
/* Get the tables for the specified publication */
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 1ae439e..964c204 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,7 +85,9 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
+ Oid relid;
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -122,7 +124,7 @@ extern List *GetPubPartitionOptionRelations(List *result,
Oid relid);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 067138e..5d58a9f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3641,6 +3641,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 2ff21a7..d5dd3a6 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -239,6 +239,124 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+ Table "public.testpub_rf_tbl1"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+----------+--------------+-------------
+ a | integer | | | | plain | |
+ b | text | | | | extended | |
+Publications:
+ "testpub5a" WHERE ((a > 1))
+ "testpub5b"
+ "testpub5c" WHERE ((a > 3))
+
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+ Publication testpub_syntax1
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE ((e < 999))
+
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+ Publication testpub_syntax2
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
+
+DROP PUBLICATION testpub_syntax2;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP SCHEMA testpub_rf_myschema;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 85a5302..75e011c 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -134,6 +134,61 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP SCHEMA testpub_rf_myschema;
+DROP PUBLICATION testpub5;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
new file mode 100644
index 0000000..e806b5d
--- /dev/null
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -0,0 +1,300 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions must succeed. In this case, rows are replicated only if c is
+# divisible by both 2 AND 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false, so the row filter
+# of the leaf partition is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true, so the row filter
+# of the root partitioned table is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
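As a side note for reviewers, the tab_rowfilter_2 expectation above can be illustrated outside PostgreSQL. The following is a minimal Python sketch (not PostgreSQL source; the names `filters` and `replicated` are invented for illustration) of how rows are selected when a table belongs to multiple publications whose row filters must all match, as the TAP test comments describe:

```python
# Sketch: tab_rowfilter_2 is in tap_pub_1 (c % 2 = 0) and tap_pub_2
# (c % 3 = 0); a row is replicated only if every filter matches.

filters = [
    lambda c: c % 2 == 0,   # tap_pub_1 row filter
    lambda c: c % 3 == 0,   # tap_pub_2 row filter
]

def replicated(c):
    """True only if all publication filters accept the row."""
    return all(f(c) for f in filters)

# The test inserts c = 1..20; only multiples of 6 survive both filters.
rows = [c for c in range(1, 21) if replicated(c)]
print(rows)  # [6, 12, 18]
```

This matches the expected `count(c), min(c), max(c)` result of `3|6|18` checked by the test.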
v38-0006-Support-updates-based-on-old-and-new-tuple-in-ro.patch
application/octet-stream; name=v38-0006-Support-updates-based-on-old-and-new-tuple-in-ro.patch
Download
From 2d6376a59731ceffa5ad5acc182ae122b948f1c6 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 10 Nov 2021 09:52:36 +1100
Subject: [PATCH v38] Support updates based on old and new tuple in row
filters.
When applying row filter on updates, check both old_tuple and new_tuple
to decide how an update needs to be transformed.
UPDATE
old-row (match) new-row (no match) -> DELETE
old-row (no match) new row (match) -> INSERT
old-row (match) new row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Also, deformed tuples are cached in slots to avoid deforming the same
tuple multiple times.
Author: Ajin Cherian
---
src/backend/catalog/pg_publication.c | 6 +-
src/backend/replication/logical/proto.c | 122 ++++++++++++++
src/backend/replication/pgoutput/pgoutput.c | 252 ++++++++++++++++++++++++++--
src/include/replication/logicalproto.h | 4 +
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/025_row_filter.pl | 4 +-
6 files changed, 368 insertions(+), 26 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index f56e01f..44bc4da 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -235,12 +235,10 @@ rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node *rfnode, Relat
rowfilter_validator(relname, rfnode);
/*
- * Rule 2: For "delete", check that filter cols are also valid replica
+ * Rule 2: For "delete" and "update", check that filter cols are also valid replica
* identity cols.
- *
- * TODO - check later for publish "update" case.
*/
- if (pub->pubactions.pubdelete)
+ if (pub->pubactions.pubdelete || pub->pubactions.pubupdate)
{
char replica_identity = rel->rd_rel->relreplident;
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..6b14340 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -32,6 +33,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
HeapTuple tuple, bool binary);
+static void logicalrep_write_tuple_cached(StringInfo out, Relation rel,
+ TupleTableSlot *slot, bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -438,6 +441,38 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
}
/*
+ * Write UPDATE to the output stream using cached virtual slots.
+ * Cached updates will have both old tuple and new tuple.
+ */
+void
+logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple, bool binary)
+{
+ pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
+
+ Assert(rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT ||
+ rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
+ rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX);
+
+ /* transaction ID (if not valid, we're not streaming) */
+ if (TransactionIdIsValid(xid))
+ pq_sendint32(out, xid);
+
+ /* use Oid as relation identifier */
+ pq_sendint32(out, RelationGetRelid(rel));
+
+ if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ pq_sendbyte(out, 'O'); /* old tuple follows */
+ else
+ pq_sendbyte(out, 'K'); /* old key follows */
+ logicalrep_write_tuple_cached(out, rel, oldtuple, binary);
+
+ pq_sendbyte(out, 'N'); /* new tuple follows */
+ logicalrep_write_tuple_cached(out, rel, newtuple, binary);
+}
+
+
+/*
* Write UPDATE to the output stream.
*/
void
@@ -746,6 +781,93 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
}
/*
+ * Write a tuple to the outputstream using a cached slot, in the most efficient format possible.
+ */
+static void
+logicalrep_write_tuple_cached(StringInfo out, Relation rel, TupleTableSlot *slot, bool binary)
+{
+ TupleDesc desc;
+ int i;
+ uint16 nliveatts = 0;
+ HeapTuple tuple = ExecFetchSlotHeapTuple(slot, false, NULL);
+
+ desc = RelationGetDescr(rel);
+
+ for (i = 0; i < desc->natts; i++)
+ {
+ if (TupleDescAttr(desc, i)->attisdropped || TupleDescAttr(desc, i)->attgenerated)
+ continue;
+ nliveatts++;
+ }
+ pq_sendint16(out, nliveatts);
+
+ /* try to allocate enough memory from the get-go */
+ enlargeStringInfo(out, tuple->t_len +
+ nliveatts * (1 + 4));
+
+ /* Write the values */
+ for (i = 0; i < desc->natts; i++)
+ {
+ HeapTuple typtup;
+ Form_pg_type typclass;
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ if (att->attisdropped || att->attgenerated)
+ continue;
+
+ if (slot->tts_isnull[i])
+ {
+ pq_sendbyte(out, LOGICALREP_COLUMN_NULL);
+ continue;
+ }
+
+ if (att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(slot->tts_values[i]))
+ {
+ /*
+ * Unchanged toasted datum. (Note that we don't promise to detect
+ * unchanged data in general; this is just a cheap check to avoid
+ * sending large values unnecessarily.)
+ */
+ pq_sendbyte(out, LOGICALREP_COLUMN_UNCHANGED);
+ continue;
+ }
+
+ typtup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(att->atttypid));
+ if (!HeapTupleIsValid(typtup))
+ elog(ERROR, "cache lookup failed for type %u", att->atttypid);
+ typclass = (Form_pg_type) GETSTRUCT(typtup);
+
+ /*
+ * Send in binary if requested and type has suitable send function.
+ */
+ if (binary && OidIsValid(typclass->typsend))
+ {
+ bytea *outputbytes;
+ int len;
+
+ pq_sendbyte(out, LOGICALREP_COLUMN_BINARY);
+ outputbytes = OidSendFunctionCall(typclass->typsend, slot->tts_values[i]);
+ len = VARSIZE(outputbytes) - VARHDRSZ;
+ pq_sendint(out, len, 4); /* length */
+ pq_sendbytes(out, VARDATA(outputbytes), len); /* data */
+ pfree(outputbytes);
+ }
+ else
+ {
+ char *outputstr;
+
+ pq_sendbyte(out, LOGICALREP_COLUMN_TEXT);
+ outputstr = OidOutputFunctionCall(typclass->typoutput, slot->tts_values[i]);
+ pq_sendcountedtext(out, outputstr, strlen(outputstr), false);
+ pfree(outputstr);
+ }
+
+ ReleaseSysCache(typtup);
+ }
+}
+
+
+/*
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index f9fdbb0..0d82736 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -133,7 +133,10 @@ typedef struct RelationSyncEntry
*/
bool rowfilter_valid;
ExprState *exprstate; /* ExprState for row filter(s) */
- TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
+ TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
+ TupleTableSlot *tmp_new_tuple; /* slot for temporary new tuple used for expression evaluation */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -168,10 +171,16 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot,
+ RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry,
+ ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -735,17 +744,102 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * Update is checked against the row filter, if any.
+ * Updates are transformed into inserts or deletes depending on whether the
+ * old_tuple and new_tuple match the filter. The transformed action is
+ * returned in the action parameter; if no transformation applies, it remains UPDATE.
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (no match) new row (match) -> INSERT
+ * old-row (match) new row (match) -> UPDATE
+ * old-row (no match) new-row (no match) -> (drop change)
+ * If it returns true, the change is to be replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
+{
+ TupleDesc desc = entry->scantuple->tts_tupleDescriptor;
+ int i;
+ bool old_matched, new_matched;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ /* updates require a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /*
+ * If there is no old_tuple, then none of the replica identity columns
+ * changed and this reduces to a simple update.
+ */
+ if (!oldtuple)
+ return pgoutput_row_filter(relation, NULL, newtuple, entry);
+
+
+ old_slot = entry->old_tuple;
+ new_slot = entry->new_tuple;
+ tmp_new_slot = entry->tmp_new_tuple;
+ ExecClearTuple(old_slot);
+ ExecClearTuple(new_slot);
+ ExecClearTuple(tmp_new_slot);
+
+ /*
+ * Unchanged toasted replica identity columns are only detoasted in the
+ * old tuple, so copy them over to the new tuple.
+ */
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, new_slot);
+
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ }
+
+ }
+
+ old_matched = pgoutput_row_filter_virtual(relation, old_slot, entry);
+ new_matched = pgoutput_row_filter_virtual(relation, tmp_new_slot, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && !old_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter the first time it is needed.
+ */
+
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
{
- EState *estate;
- ExprContext *ecxt;
ListCell *lc;
- bool result = true;
Oid relid = RelationGetRelid(relation);
List *rfnodes = NIL;
int n_filters;
@@ -763,7 +857,7 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
TupleDesc tupdesc = RelationGetDescr(relation);
/*
- * Create a tuple table slot for row filter. TupleDesc must live as
+ * Create tuple table slots for row filter. TupleDesc must live as
* long as the cache remains. Release the tuple table slot if it
* already exists.
*/
@@ -772,9 +866,31 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
+ if (entry->old_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->old_tuple);
+ entry->old_tuple = NULL;
+ }
+
+ if (entry->new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->new_tuple);
+ entry->new_tuple = NULL;
+ }
+
+ if (entry->tmp_new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->tmp_new_tuple);
+ entry->tmp_new_tuple = NULL;
+ }
+
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+
MemoryContextSwitchTo(oldctx);
/*
@@ -860,6 +976,66 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
entry->rowfilter_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = slot;
+
+ /*
+ * NOTE: Multiple publication row-filters have already been combined to a
+ * single exprstate.
+ */
+ if (entry->exprstate)
+ {
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(entry->exprstate, ecxt);
+ }
+
+ /* Cleanup allocated resources */
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
/* Bail out if there is no row filter */
if (!entry->exprstate)
@@ -890,7 +1066,6 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
}
/* Cleanup allocated resources */
- ResetExprContext(ecxt);
FreeExecutorState(estate);
PopActiveSnapshot();
@@ -948,6 +1123,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -956,7 +1134,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
break;
/*
@@ -987,9 +1165,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
- /* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter_update_check(relation, oldtuple, newtuple, relentry,
+ &modified_action))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1012,8 +1191,29 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ if (relentry->new_tuple != NULL && !TTS_EMPTY(relentry->new_tuple))
+ logicalrep_write_update_cached(ctx->out, xid, relation,
+ relentry->old_tuple, relentry->new_tuple, data->binary);
+ else
+ logicalrep_write_update(ctx->out, xid, relation, oldtuple,
+ newtuple, data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1023,7 +1223,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1441,6 +1641,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
+ entry->new_tuple = NULL;
+ entry->old_tuple = NULL;
+ entry->tmp_new_tuple = NULL;
entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
@@ -1662,6 +1865,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
+ if (entry->new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->new_tuple);
+ entry->new_tuple = NULL;
+ }
+ if (entry->old_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->old_tuple);
+ entry->old_tuple = NULL;
+ }
+ if (entry->tmp_new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->tmp_new_tuple);
+ entry->tmp_new_tuple = NULL;
+ }
if (entry->exprstate != NULL)
{
free(entry->exprstate);
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..ba71f3f 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -212,6 +213,9 @@ extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
HeapTuple newtuple, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index dff55c2..3fc503f 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -220,7 +220,8 @@ $node_publisher->wait_for_catchup($appname);
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+# but row deleted after the update below.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -232,7 +233,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
--
1.8.3.1
On Fri, Nov 5, 2021 at 7:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
2.
+preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner, bool alter_drop)
 {
 	ListCell   *cell;
 	PublicationObjSpec *pubobj;
@@ -17341,7 +17359,15 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
 					errcode(ERRCODE_SYNTAX_ERROR),
 					errmsg("invalid table name at or near"),
 					parser_errposition(pubobj->location));
-		else if (pubobj->name)
+
+			/* cannot use WHERE row-filter for DROP TABLE from publications */
+			if (pubobj->pubtable && pubobj->pubtable->whereClause && alter_drop)
+				ereport(ERROR,
+						errcode(ERRCODE_SYNTAX_ERROR),
+						errmsg("invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE"),
+						parser_errposition(pubobj->location));
+

This change looks a bit ad-hoc to me. Can we handle this at a later
point of time in publicationcmds.c?
Fixed in v38-0001 [1].
------
[1]: /messages/by-id/CAHut+PvWCS+W_OLV60AZJucY1RFpkXS=hfvYWwpwyMvifdJxiQ@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Mon, Nov 8, 2021 at 5:53 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Fri, Nov 5, 2021 1:14 PM Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v37* patches.
Thanks for updating the patches.
Few comments:

1) v37-0001
I think it might be better to also show the filter expression in '\d+
tablename' command after publication description.
Fixed in v38-0001 [1].
------
[1]: /messages/by-id/CAHut+PvWCS+W_OLV60AZJucY1RFpkXS=hfvYWwpwyMvifdJxiQ@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Nov 9, 2021 at 2:03 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Friday, November 5, 2021 1:14 PM, Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v37* patches.
Thanks for your patch. I have a problem when using this patch.
The document about "create publication" in patch says:
The <literal>WHERE</literal> clause should contain only columns that are
part of the primary key or be covered by <literal>REPLICA
IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
be replicated.

But I tried this patch, and the columns contained in the WHERE clause must be
covered by REPLICA IDENTITY; it doesn't matter whether they are part of the
primary key. (We can see it in Case 4 of publication.sql, too.) So maybe we
should modify the document.
PG Docs is changed in v38-0004 [1]. Please check if it is OK.
------
[1]: /messages/by-id/CAHut+PvWCS+W_OLV60AZJucY1RFpkXS=hfvYWwpwyMvifdJxiQ@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
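The REPLICA IDENTITY point discussed above can be sketched in plain Python (an illustrative model only: the dictionaries, helper names, and filter below are invented, not the patch's actual code). For DELETE, the old row only carries the replica-identity columns; every other column is effectively NULL, so a filter on a non-identity column cannot evaluate to true.

```python
# Simplified model: for DELETE, only replica-identity columns survive
# in the old row; all other columns become None (NULL).
def old_row_for_delete(row, replica_identity_cols):
    return {c: (v if c in replica_identity_cols else None)
            for c, v in row.items()}

# Hypothetical row filter: WHERE (b <> 'filtered'); NULL evaluates to false.
def row_filter(row):
    return row["b"] is not None and row["b"] != "filtered"

full = {"a": 1, "b": "keep"}
assert row_filter(full) is True                          # INSERT/UPDATE: passes
# DELETE sees only the identity columns (here, just "a"), so the filter fails:
assert row_filter(old_row_for_delete(full, {"a"})) is False
```

This is why the documentation wording matters: with the default replica identity (primary key), a filter on a non-key column silently stops DELETEs from replicating.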
On Thu, Nov 4, 2021 at 2:21 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
3)
+		oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+		rfnode = stringToNode(TextDatumGetCString(rfdatum));
+		exprstate = pgoutput_row_filter_init_expr(rfnode);
+		entry->exprstates = lappend(entry->exprstates, exprstate);
+		MemoryContextSwitchTo(oldctx);
+	}

Currently in the patch, it saves and executes each expression separately. I was
thinking it might be better if we can use "AND" to combine all the expressions
into one expression, then we can initialize and optimize the final expression
and execute it only once.
Fixed in v38-0003 [1].
------
[1]: /messages/by-id/CAHut+PvWCS+W_OLV60AZJucY1RFpkXS=hfvYWwpwyMvifdJxiQ@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
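Combining the per-publication filters with AND, as suggested above, can be sketched in plain Python (illustrative only; the patch itself combines parse trees into one ExprState, and the helper names here are invented):

```python
# Instead of evaluating each publication's row filter separately,
# build one predicate equivalent to AND-ing all of them, prepared once
# and evaluated once per row.
def combine_with_and(predicates):
    return lambda row: all(p(row) for p in predicates)

filters = [
    lambda row: row["a"] > 1000,          # WHERE (a > 1000)
    lambda row: row["b"] != "filtered",   # WHERE (b <> 'filtered')
]
combined = combine_with_and(filters)

assert combined({"a": 1500, "b": "x"}) is True
assert combined({"a": 500, "b": "x"}) is False
assert combined({"a": 1500, "b": "filtered"}) is False
```

The payoff in the real patch is that the combined expression can be initialized and optimized once, rather than once per publication.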
On Thu, Nov 4, 2021 10:47 AM Peter Smith <smithpb2250@gmail.com> wrote:
PROPOSAL
I propose that we change the way duplicate tables are processed to make it so
that it is always the *last* one that takes effect (instead of the *first* one). AFAIK
doing this won't affect any current PG behaviour, but doing this will let the new
row-filter feature work in a consistent/predictable/sane way.

Thoughts?
The last one taking effect sounds reasonable to me.
OTOH, I think we should make the behavior here consistent with Column Filter
Patch in another thread. IIRC, in the current column filter patch, only the
first one's filter takes effect. So, maybe better to get Rahila and Alvaro's
thoughts on this.
Best regards,
Hou zj
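The last-one-wins merging proposed above could be sketched as follows (a plain-Python model, not the patch code; tables are assumed to be processed in list order):

```python
# A later entry for the same table overwrites the earlier one, so the
# *last* filter listed in the command takes effect.
entries = [("t1", "a > 1"), ("t2", None), ("t1", "a > 2")]

filters = {}
for table, where in entries:
    filters[table] = where  # last occurrence wins

assert filters == {"t1": "a > 2", "t2": None}
```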
On Tue, Nov 9, 2021 at 2:22 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Fri, Nov 5, 2021 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Nov 5, 2021 at 10:44 AM Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v37* patches.
3.
-			| ColId
+			| ColId OptWhereClause
 				{
 					$$ = makeNode(PublicationObjSpec);
 					$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
-					$$->name = $1;
+					if ($2)
+					{
+						$$->pubtable = makeNode(PublicationTable);
+						$$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+						$$->pubtable->whereClause = $2;
+					}
+					else
+					{
+						$$->name = $1;
+					}

Again this doesn't appear to be the right way. I think this should be handled at
a later point.

I think the difficulty to handle this at a later point is that we need to make
sure we don't lose the whereclause. Currently, we can only save the whereclause
in PublicationTable structure and the PublicationTable is only used for TABLE,
but '| ColId' can be used for either a SCHEMA or TABLE. We cannot distinguish
the actual type at this stage, so we always need to save the whereclause if
it's NOT NULL.
I see your point. But, I think we can add some comments here
indicating that the user might have mistakenly given where clause with
some schema which we will identify later and give an appropriate
error. Then, in preprocess_pubobj_list(), identify if the user has
given the where clause with schema name and give an appropriate error.
I think the possible approaches to delay this check are:
(1) we can delete the PublicationTable structure and put all the vars(relation,
whereclause) in PublicationObjSpec. In this approach, we don't need check if
the whereclause is NULL in the '| ColId', we can check this at a later point.
Yeah, we can do this but I don't think it will reduce any checks later
to identify if the user has given where clause only for tables. So,
let's keep this structure around as that will at least keep all things
related to the table together in one structure.
Or
(2) Add a new pattern for whereclause in PublicationObjSpec:
The change could be:
PublicationObjSpec:
			...
			| ColId
			...
+			| ColId WHERE '(' a_expr ')'
+				{
+					$$ = makeNode(PublicationObjSpec);
+					$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
+					$$->pubtable = makeNode(PublicationTable);
+					$$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+					$$->pubtable->whereClause = $2;
+				}

In this approach, we also don't need the "if ($2)" check.
This seems redundant and we still need same checks later to see if the
where clause is given with the table object.
--
With Regards,
Amit Kapila.
On Wed, Nov 10, 2021 10:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Nov 9, 2021 at 2:22 PM houzj.fnst@fujitsu.com wrote:
On Fri, Nov 5, 2021 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Nov 5, 2021 at 10:44 AM Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v37* patches.
3.
-			| ColId
+			| ColId OptWhereClause
 				{
 					$$ = makeNode(PublicationObjSpec);
 					$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
-					$$->name = $1;
+					if ($2)
+					{
+						$$->pubtable = makeNode(PublicationTable);
+						$$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+						$$->pubtable->whereClause = $2;
+					}
+					else
+					{
+						$$->name = $1;
+					}

Again this doesn't appear to be the right way. I think this should
be handled at a later point.

I think the difficulty to handle this at a later point is that we need
to make sure we don't lose the whereclause. Currently, we can only
save the whereclause in PublicationTable structure and the
PublicationTable is only used for TABLE, but '| ColId' can be used for
either a SCHEMA or TABLE. We cannot distinguish the actual type at
this stage, so we always need to save the whereclause if it's NOT NULL.

I see your point. But, I think we can add some comments here indicating that
the user might have mistakenly given where clause with some schema which we
will identify later and give an appropriate error. Then, in
preprocess_pubobj_list(), identify if the user has given the where clause with
schema name and give an appropriate error.
OK, IIRC, in this approach, we need to set both $$->name and $$->pubtable in
'| ColId OptWhereClause'. And In preprocess_pubobj_list, we can add some check
if both name and pubtable is NOT NULL.
the grammar code could be:
| ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->name = $1;
+ /* xxx */
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
$$->location = @1;
}
preprocess_pubobj_list
...
else if (pubobj->pubobjtype == PUBLICATIONOBJ_REL_IN_SCHEMA ||
pubobj->pubobjtype == PUBLICATIONOBJ_CURRSCHEMA)
{
...
+ if (pubobj->name &&
+ (!pubobj->pubtable || !pubobj->pubtable->whereClause))
pubobj->pubobjtype = PUBLICATIONOBJ_REL_IN_SCHEMA;
else if (!pubobj->name && !pubobj->pubtable)
pubobj->pubobjtype = PUBLICATIONOBJ_CURRSCHEMA;
else
ereport(ERROR,
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid schema name at or near"),
parser_errposition(pubobj->location));
}
Best regards,
Hou zj
On Wed, Nov 10, 2021 at 4:57 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Wed, Nov 10, 2021 10:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Nov 9, 2021 at 2:22 PM houzj.fnst@fujitsu.com wrote:
On Fri, Nov 5, 2021 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Nov 5, 2021 at 10:44 AM Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v37* patches.
3.
-			| ColId
+			| ColId OptWhereClause
 				{
 					$$ = makeNode(PublicationObjSpec);
 					$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
-					$$->name = $1;
+					if ($2)
+					{
+						$$->pubtable = makeNode(PublicationTable);
+						$$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+						$$->pubtable->whereClause = $2;
+					}
+					else
+					{
+						$$->name = $1;
+					}

Again this doesn't appear to be the right way. I think this should
be handled at a later point.

I think the difficulty to handle this at a later point is that we need
to make sure we don't lose the whereclause. Currently, we can only
save the whereclause in PublicationTable structure and the
PublicationTable is only used for TABLE, but '| ColId' can be used for
either a SCHEMA or TABLE. We cannot distinguish the actual type at
this stage, so we always need to save the whereclause if it's NOT NULL.

I see your point. But, I think we can add some comments here indicating that
the user might have mistakenly given where clause with some schema which we
will identify later and give an appropriate error. Then, in
preprocess_pubobj_list(), identify if the user has given the where clause with
schema name and give an appropriate error.

OK, IIRC, in this approach, we need to set both $$->name and $$->pubtable in
'| ColId OptWhereClause'. And In preprocess_pubobj_list, we can add some check
if both name and pubtable is NOT NULL.

The grammar code could be:
| ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
					$$->name = $1;
+					/* xxx */
+					$$->pubtable = makeNode(PublicationTable);
+					$$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+					$$->pubtable->whereClause = $2;
					$$->location = @1;
				}

preprocess_pubobj_list
...
		else if (pubobj->pubobjtype == PUBLICATIONOBJ_REL_IN_SCHEMA ||
				 pubobj->pubobjtype == PUBLICATIONOBJ_CURRSCHEMA)
		{
			...
+			if (pubobj->name &&
+				(!pubobj->pubtable || !pubobj->pubtable->whereClause))
				pubobj->pubobjtype = PUBLICATIONOBJ_REL_IN_SCHEMA;
			else if (!pubobj->name && !pubobj->pubtable)
				pubobj->pubobjtype = PUBLICATIONOBJ_CURRSCHEMA;
			else
				ereport(ERROR,
						errcode(ERRCODE_SYNTAX_ERROR),
						errmsg("invalid schema name at or near"),
						parser_errposition(pubobj->location));
		}
Hi Hou-san. Actually, I have already implemented this part according
to my understanding of Amit's suggestion and it seems to be working
well.
Please wait for v39-0001, then feel free to post review comments about
it if you think there are still problems.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Mon, Nov 8, 2021 at 5:53 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
3) v37-0005
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr
I think there could be other node types which can also be considered simple
expressions, for example T_NullIfExpr.
The current walker restrictions are from a previously agreed decision
by Amit/Tomas [1] and from an earlier suggestion from Andres [2] to
keep everything very simple for a first version.
Yes, you are right, there might be some additional node types that
might be fine, but at this time I don't want to add anything different
without getting their approval to do so. Anyway, additions like this
are all candidates for a future version of this row-filter feature.
Personally, I think it's natural to only check the IMMUTABLE and
whether-user-defined in the new function rowfilter_walker. We can keep the
other row-filter errors which were thrown for EXPR_KIND_PUBLICATION_WHERE in
the 0001 patch.
YMMV. IMO it is much more convenient for all the filter validations to
be centralized just in one walker function instead of scattered all
over the place like they were in the 0001 patch.
-----
[1]: /messages/by-id/CAA4eK1+XoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ@mail.gmail.com
[2]: /messages/by-id/20210128022032.eq2qqc6zxkqn5syt@alap3.anarazel.de
Kind Regards,
Peter Smith.
Fujitsu Australia
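The centralized-walker idea discussed above can be sketched very loosely in Python (this is not PostgreSQL's expression_tree_walker API; the node representation and function names are invented). All validation lives in one recursive function that rejects any node type outside the allow-list mentioned earlier (Var, OpExpr, Const, BoolExpr, FuncExpr):

```python
# One recursive walker rejects unsupported node types, instead of
# scattering checks across the parser.
ALLOWED_NODES = {"Var", "OpExpr", "Const", "BoolExpr", "FuncExpr"}

def rowfilter_walker(node):
    if node["type"] not in ALLOWED_NODES:
        raise ValueError("unsupported node in row filter: " + node["type"])
    for child in node.get("args", []):
        rowfilter_walker(child)

# WHERE (a > 1000 AND b <> 'filtered') as a toy tree: accepted.
ok = {"type": "BoolExpr",
      "args": [{"type": "OpExpr",
                "args": [{"type": "Var"}, {"type": "Const"}]}]}
rowfilter_walker(ok)

# A node type outside the allow-list (e.g. NULLIF) is rejected.
try:
    rowfilter_walker({"type": "NullIfExpr"})
except ValueError as e:
    rejected = str(e)
assert "NullIfExpr" in rejected
```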
Attaching version 39.
V39 fixes the following review comments:
On Fri, Nov 5, 2021 at 7:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);

I think for the correct merge you need to just call
CheckObjSchemaNotAlreadyInPublication() before this for loop. BTW, I
have a question regarding this implementation. Here, it has been
assumed that the new rel will always be specified with a different
qual, what if there is no qual or if the qual is the same?
Actually with this code, no qual or a different qual does not matter,
it recreates everything as specified by the ALTER SET command.
I have added CheckObjSchemaNotAlreadyInPublication as you specified since this
is required to match the schema patch behaviour. I've also added
a test case that tests this particular case.
On Mon, Nov 8, 2021 at 5:53 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
2) v37-0004
+	/* Scan the expression tree for referenceable objects */
+	find_expr_references_walker(expr, &context);
+
+	/* Remove any duplicates */
+	eliminate_duplicate_dependencies(context.addrs);
+

The 0004 patch currently uses find_expr_references_walker to get all the
referenced objects. I am wondering, do we only need to get the columns in the
expression? I think maybe we can check the replica identity like [1].
Changed as suggested.
On Thu, Nov 4, 2021 at 2:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I see your point. But, I think we can add some comments here
indicating that the user might have mistakenly given where clause with
some schema which we will identify later and give an appropriate
error. Then, in preprocess_pubobj_list(), identify if the user has
given the where clause with schema name and give an appropriate error.
Changed as suggested.
On Thu, Nov 4, 2021 at 2:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
BTW, why not give an error if the duplicate table is present and any one of them or
both have row-filters? I think the current behavior makes sense
because it makes no difference if the table is present more than once
in the list but with row-filter it can make difference so it seems to
me that giving an error should be considered.
Changed as suggested, also added test cases for the same.
regards,
Ajin Cherian
Fujitsu Australia
Attachments:
v39-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v39-0001-Row-filter-for-logical-replication.patchDownload
From 5c9a5f8213e0ef6215fe7229bb91535354d302e0 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Thu, 11 Nov 2021 04:40:23 -0500
Subject: [PATCH v39 1/6] Row filter for logical replication.
This feature adds row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If subscriber is a pre-15 version, data
synchronization won't use row filters if they are defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
Psql commands \dRp+ and \d+ will display any row filters.
Author: Euler Taveira, Peter Smith
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 13 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 50 ++++-
src/backend/commands/publicationcmds.c | 79 +++++---
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 38 +++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 257 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 27 ++-
src/include/catalog/pg_publication.h | 4 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 148 ++++++++++++++
src/test/regress/sql/publication.sql | 75 +++++++
src/test/subscription/t/025_row_filter.pl | 300 ++++++++++++++++++++++++++++
25 files changed, 1143 insertions(+), 70 deletions(-)
mode change 100644 => 100755 src/backend/parser/gram.y
create mode 100644 src/test/subscription/t/025_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1d11be..af6b1f6 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6299,6 +6299,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index bb4ef5e..01247d7 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -30,7 +30,7 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -52,7 +52,9 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
remove one or more tables/schemas from the publication. Note that adding
tables/schemas to a publication that is already subscribed to will require a
<literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on the
- subscribing side in order to become effective.
+ subscribing side in order to become effective. Note also that the combination
+ of row-filter <literal>WHERE</literal> for <literal>DROP</literal> clause is
+ not allowed.
</para>
<para>
@@ -109,7 +111,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ca01d8c..52f6a1c 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -28,7 +28,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -76,6 +76,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -226,6 +230,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or be covered by <literal>REPLICA
+ IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
+ be replicated. That's because old row is used and it only contains primary
+ key or columns that are part of the <literal>REPLICA IDENTITY</literal>; the
+ remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ and <command>UPDATE</command> operations, any column might be used in the
+ <literal>WHERE</literal> clause. New row is used and it contains all
+ columns. A <literal>NULL</literal> value causes the expression to evaluate
+ to false; avoid using columns without not-null constraints in the
+ <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
+ not allow functions or user-defined operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -240,6 +259,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -253,6 +277,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..5a9430e 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -206,7 +206,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether to copy pre-existing data in the publications
that are being subscribed to when the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all of the
+ expressions to be copied. If the publisher is a
+ <productname>PostgreSQL</productname> version before 15, any row
+ filtering is ignored during the initial data copy.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index fed83b8..57d08e7 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -251,22 +254,28 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -283,10 +292,30 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -300,6 +329,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -316,6 +351,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 7d4a0e9..00a0a58 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -529,41 +529,30 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *delrels = NIL;
ListCell *oldlc;
+ /*
+ * Check if the relation is a member of the existing schema in the
+ * publication or a member of the specified schema list.
+ */
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
-
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could instead remove only
+ * (i) tables that are not found in the new table list and (ii) tables
+ * that are being re-added with a different qual expression, but for
+ * (ii) simply updating the existing tuple is not enough, because of
+ * the qual expression's dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- PublicationRelInfo *newpubrel;
-
- newpubrel = (PublicationRelInfo *) lfirst(newlc);
- if (RelationGetRelid(newpubrel->relation) == oldrelid)
- {
- found = true;
- break;
- }
- }
- /* Not yet in the list, open it and add to the list */
- if (!found)
- {
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
- }
+ PublicationRelInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -899,6 +888,7 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ List *relids_with_rf = NIL;
/*
* Open, share-lock, and check all the explicitly-specified relations
@@ -926,15 +916,31 @@ OpenTableList(List *tables)
*/
if (list_member_oid(relids, myrelid))
{
+ char *relname = pstrdup(RelationGetRelationName(rel));
+
table_close(rel, ShareUpdateExclusiveLock);
+
+ /* Disallow duplicate tables if there are any with row-filters. */
+ if (t->whereClause || list_member_oid(relids_with_rf, myrelid))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("conflicting or redundant row-filters for \"%s\"",
+ relname)));
+
+ pfree(relname);
continue;
}
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = myrelid;
pub_rel->relation = rel;
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
+ if (t->whereClause)
+ relids_with_rf = lappend_oid(relids_with_rf, myrelid);
+
/*
* Add children of this rel, if requested, so that they too are added
* to the publication. A partitioned table can't have any inheritance
@@ -966,7 +972,10 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = childrelid;
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -974,6 +983,7 @@ OpenTableList(List *tables)
}
list_free(relids);
+ list_free(relids_with_rf);
return rels;
}
@@ -993,6 +1003,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -1042,7 +1054,7 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(pub_rel->relid, GetUserId()))
aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
RelationGetRelationName(rel));
@@ -1088,6 +1100,11 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
RelationGetRelationName(rel))));
}
+ if (pubrel->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE")));
+
ObjectAddressSet(obj, PublicationRelRelationId, prid);
performDeletion(&obj, DROP_CASCADE, 0);
}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index ad1ea2f..e3b0039 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4830,6 +4830,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index f537d3e..5776447 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2312,6 +2312,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index a6d0cef..f480846
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9652,12 +9652,13 @@ CreatePublicationStmt:
* relation_expr here.
*/
PublicationObjSpec:
- TABLE relation_expr
+ TABLE relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_TABLE;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $2;
+ $$->pubtable->whereClause = $3;
}
| ALL TABLES IN_P SCHEMA ColId
{
@@ -9672,28 +9673,45 @@ PublicationObjSpec:
$$->pubobjtype = PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA;
$$->location = @5;
}
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ /*
+ * The OptWhereClause (row-filter) must be stored here
+ * but it is valid only for tables. If the ColId is
+ * not actually a table, this will be detected later
+ * in preprocess_pubobj_list() and an error will be thrown.
+ */
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
$$->location = @1;
}
- | ColId indirection
+ | ColId indirection OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
+ $$->pubtable->whereClause = $3;
$$->location = @1;
}
/* grammar like tablename * , ONLY tablename, ONLY ( tablename ) */
- | extended_relation_expr
+ | extended_relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $1;
+ $$->pubtable->whereClause = $2;
}
| CURRENT_SCHEMA
{
@@ -17341,7 +17359,8 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ if (pubobj->name)
{
/* convert it to PublicationTable */
PublicationTable *pubtable = makeNode(PublicationTable);
@@ -17354,6 +17373,13 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
else if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_SCHEMA ||
pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA)
{
+ /* Row filters are not allowed on schema objects. */
+ if (pubobj->pubtable && pubobj->pubtable->whereClause)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("WHERE clause not allowed for schema"),
+ parser_errposition(pubobj->location));
+
/*
* We can distinguish between the different type of schema
* objects based on whether name and pubtable is set.
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 542f916..29bebb7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
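As a reviewer aside, the user-defined-operator restriction in make_op() above reduces to an OID range check: built-in objects sit below FirstNormalObjectId (16384 in current sources). A standalone sketch of that check, with made-up names and a plain typedef instead of the backend's Oid:

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Oid;

/* Value of FirstNormalObjectId: first OID assigned to user-created objects */
#define FIRST_NORMAL_OBJECT_ID 16384

/*
 * Mirrors the patch's test in make_op(): any operator whose OID is at or
 * above FirstNormalObjectId is user-defined, and therefore rejected in a
 * publication WHERE expression.
 */
static bool
operator_allowed_in_row_filter(Oid opoid)
{
	return opoid < FIRST_NORMAL_OBJECT_ID;
}
```

This keeps the filter restricted to operators shipped with the server, whose behavior is known and immutable.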
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..9d86a10 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get the relation's qual. DISTINCT prevents the same expression for a
+ * table that appears in multiple publications from being included more
+ * than once in the final expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
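To make the string-building in copy_table() above easier to follow, here is a minimal standalone sketch of the query it produces for a filtered table — a hypothetical helper using plain snprintf in place of the backend's StringInfo, with already-quoted column/table/qual strings assumed:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Join the per-publication row filters with AND and wrap the SELECT in
 * COPY ( ... ) TO STDOUT, as the tablesync worker does for the initial
 * data copy.
 */
static void
build_copy_query(char *buf, size_t buflen, const char *cols,
				 const char *table, const char *const *quals, int nquals)
{
	size_t		off;

	off = (size_t) snprintf(buf, buflen, "COPY (SELECT %s FROM %s",
							cols, table);

	/* First filter gets WHERE, the rest are AND'ed on */
	for (int i = 0; i < nquals; i++)
		off += (size_t) snprintf(buf + off, buflen - off, "%s%s",
								 i == 0 ? " WHERE " : " AND ", quals[i]);

	snprintf(buf + off, buflen - off, ") TO STDOUT");
}
```

With no filters the unfiltered COPY (SELECT ...) form comes out unchanged, matching the non-table / generated-column path in the patch.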
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6f6a203..077ae18 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +148,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +157,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +639,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache the ExprState using CacheMemoryContext. This is the same code as
+ * ExecPrepareExpr(), which is not used here because no EState is
+ * available. The executor should probably provide a separate function
+ * for preparing expressions executed outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the
+ * change is not replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change should be replicated, false otherwise.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +808,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +832,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +839,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * The schema must be sent before the relation is switched below,
+ * because maybe_send_schema() also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +872,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +906,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +975,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1116,9 +1297,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1141,6 +1323,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1160,6 +1344,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *schemaPubids = GetSchemaPublications(schemaId);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1173,6 +1358,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for the row filter. The TupleDesc must
+ * live as long as the cache entry remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1182,6 +1383,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1245,9 +1449,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1365,6 +1593,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1603,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1391,7 +1622,15 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ {
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
+ }
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 7e98371..b404fd2 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4229,6 +4229,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4239,9 +4240,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4250,6 +4258,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4290,6 +4299,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationObject(&(pubrinfo[j].dobj), fout);
@@ -4360,8 +4373,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index d1d8608..0842a3c 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -630,6 +630,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index ea721d9..8be5643 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3150,17 +3150,22 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
" JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
"WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
"UNION\n"
"SELECT pubname\n"
+ " , pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+ " , pg_catalog.pg_class c\n"
"WHERE pr.prrelid = '%s'\n"
+ " AND c.oid = pr.prrelid\n"
"UNION\n"
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
"WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
"ORDER BY 1;",
@@ -3196,6 +3201,13 @@ describeOneTableDetails(const char *schemaname,
printfPQExpBuffer(&buf, " \"%s\"",
PQgetvalue(result, i, 0));
+ if (pset.sversion >= 150000)
+ {
+ /* Also display the publication row-filter (if any) for this table */
+ if (!PQgetisnull(result, i, 1))
+ appendPQExpBuffer(&buf, " WHERE (%s)", PQgetvalue(result, i, 1));
+ }
+
printTableAddFooter(&cont, buf.data);
}
PQclear(result);
@@ -6320,8 +6332,12 @@ addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
for (i = 0; i < count; i++)
{
if (!singlecol)
+ {
printfPQExpBuffer(buf, " \"%s.%s\"", PQgetvalue(res, i, 0),
PQgetvalue(res, i, 1));
+ if (!PQgetisnull(res, i, 2))
+ appendPQExpBuffer(buf, " WHERE (%s)", PQgetvalue(res, i, 2));
+ }
else
printfPQExpBuffer(buf, " \"%s\"", PQgetvalue(res, i, 0));
@@ -6450,8 +6466,15 @@ describePublications(const char *pattern)
{
/* Get the tables for the specified publication */
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 1ae439e..964c204 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,7 +85,9 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
+ Oid relid;
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -122,7 +124,7 @@ extern List *GetPubPartitionOptionRelations(List *result,
Oid relid);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 067138e..5d58a9f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3641,6 +3641,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 2ff21a7..9e7f81d 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -239,6 +239,154 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+ Table "public.testpub_rf_tbl1"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+----------+--------------+-------------
+ a | integer | | | | plain | |
+ b | text | | | | extended | |
+Publications:
+ "testpub5a" WHERE ((a > 1))
+ "testpub5b"
+ "testpub5c" WHERE ((a > 3))
+
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+ Publication testpub_syntax1
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE ((e < 999))
+
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+ Publication testpub_syntax2
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
+
+DROP PUBLICATION testpub_syntax2;
+-- fail - a WHERE row-filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a =...
+ ^
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+ERROR: invalid to use WHERE (row-filter) for a schema
+LINE 1: ...ax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf...
+ ^
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if any of them has a WHERE row-filter
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+ERROR: cannot add relation "testpub_rf_myschema1.testpub_rf_tb16" to publication
+DETAIL: Table's schema "testpub_rf_myschema1" is already part of the publication or part of the specified schema list.
+RESET client_min_messages;
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 85a5302..21cc923 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -134,6 +134,81 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;
+-- fail - a WHERE row-filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if any of them has a WHERE row-filter
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+RESET client_min_messages;
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
new file mode 100644
index 0000000..e806b5d
--- /dev/null
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -0,0 +1,300 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions must match. In this case, a row is replicated only if c is
+# divisible by both 2 AND 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false, so the row filter
+# of the partition itself is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, because (b <> 'filtered') evaluates to NULL, which is treated as false
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
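
[Editorial aside: the UPDATE (1600, NULL) expectation in the test above follows SQL
three-valued logic: a row filter that evaluates to NULL is treated as not matching,
so the change is not replicated. A minimal Python sketch of that rule (illustrative
only, not the pgoutput implementation; `row_matches` is a made-up name):

```python
def row_matches(a, b):
    """Mimic SQL three-valued logic for the filter (a > 1000 AND b <> 'filtered').

    None stands for SQL NULL. A NULL filter result is treated as no-match,
    which is why the UPDATE setting b = NULL is not replicated.
    """
    def sql_and(x, y):
        # SQL AND: FALSE dominates, then NULL, then TRUE.
        if x is False or y is False:
            return False
        if x is None or y is None:
            return None
        return True

    gt = None if a is None else a > 1000          # a > 1000
    ne = None if b is None else b != 'filtered'   # b <> 'filtered'
    return sql_and(gt, ne) is True                # only TRUE replicates

assert row_matches(1601, 'test 1601 updated') is True
assert row_matches(800, 'test 800') is False
assert row_matches(1600, None) is False   # NULL result -> not replicated
```
]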
v39-0003-PS-ExprState-cache-modifications.patchapplication/octet-stream; name=v39-0003-PS-ExprState-cache-modifications.patchDownload
From be6758cbf52f4587d2676bdc41e9198aeeacf240 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Thu, 11 Nov 2021 04:57:18 -0500
Subject: [PATCH v39 3/6] PS - ExprState cache modifications.
The cached row-filter state (e.g. ExprState *) is now invalidated only in the
rel_sync_cache_relation_cb function, so an ALTER PUBLICATION for one table
no longer causes the row filters of other tables to become invalidated.
Also, all code related to caching row filters has been removed from the
get_rel_sync_entry function; the caching is now done just before the filters
are needed, in the pgoutput_row_filter function.
If there are multiple publication filters for a given table, these are all
combined into a single filter.
Author: Peter Smith, Greg Nancarrow
Changes are based on suggestions from Amit [1] [2], and Houz [3]
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
[3] https://www.postgresql.org/message-id/OS0PR01MB5716090A70A73ADF58C58950948D9%40OS0PR01MB5716.jpnprd01.prod.outlook.com
---
src/backend/replication/pgoutput/pgoutput.c | 229 +++++++++++++++++++---------
1 file changed, 154 insertions(+), 75 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 077ae18..f9fdbb0 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -1,4 +1,4 @@
-/*-------------------------------------------------------------------------
+/*------------------------------------------------------------------------
*
* pgoutput.c
* Logical Replication output plugin
@@ -21,6 +21,7 @@
#include "executor/executor.h"
#include "fmgr.h"
#include "nodes/nodeFuncs.h"
+#include "nodes/makefuncs.h"
#include "optimizer/optimizer.h"
#include "parser/parse_coerce.h"
#include "replication/logical.h"
@@ -123,7 +124,15 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
- List *exprstate; /* ExprState for row filter */
+
+ /*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' only means the cached exprstate is up to
+ * date; it doesn't mean that there actually is any row filter present
+ * for the current relid.
+ */
+ bool rowfilter_valid;
+ ExprState *exprstate; /* ExprState for row filter(s) */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
@@ -161,7 +170,7 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static EState *create_estate_for_relation(Relation rel);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
/*
@@ -731,20 +740,134 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
* If it returns true, the change is replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
{
EState *estate;
ExprContext *ecxt;
ListCell *lc;
bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ List *rfnodes = NIL;
+ int n_filters;
+
+ /*
+ * If the row filter cache is currently flagged invalid, we don't yet
+ * know whether there are any row filters for this relation.
+ *
+ * This code usually executes only once per relation.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ bool am_partition = get_rel_relispartition(relid);
+ MemoryContext oldctx;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains. Release the tuple table slot if it
+ * already exists.
+ */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState and cache it in entry->exprstate.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+ Oid pub_relid = relid;
+
+ if (pub->pubviaroot && am_partition)
+ {
+ if (pub->alltables)
+ pub_relid = llast_oid(get_partition_ancestors(relid));
+ else
+ {
+ List *ancestors = get_partition_ancestors(relid);
+ ListCell *lc2;
+
+ /*
+ * Find the "topmost" ancestor that is in this
+ * publication.
+ */
+ foreach(lc2, ancestors)
+ {
+ Oid ancestor = lfirst_oid(lc2);
+
+ if (list_member_oid(GetRelationPublications(ancestor),
+ pub->oid))
+ {
+ pub_relid = ancestor;
+ }
+ }
+ }
+ }
+
+ /*
+ * Look up whether there is a row filter and, if so, remember it in a
+ * list. The code following this 'publications' loop combines all filters.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(pub_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes = lappend(rfnodes, rfnode);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+
+ } /* loop all subscribed publications */
+
+ /*
+ * Combine all the row filters (if any) into a single filter, then
+ * build the ExprState for it.
+ */
+ n_filters = list_length(rfnodes);
+ if (n_filters > 0)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = n_filters > 1 ? makeBoolExpr(AND_EXPR, rfnodes, -1) : linitial(rfnodes);
+ entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
+
+ list_free(rfnodes);
+ }
+
+ entry->rowfilter_valid = true;
+ }
/* Bail out if there is no row filter */
- if (entry->exprstate == NIL)
+ if (!entry->exprstate)
return true;
elog(DEBUG3, "table \"%s.%s\" has row filter",
- get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
- get_rel_name(relation->rd_id));
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
PushActiveSnapshot(GetTransactionSnapshot());
@@ -757,20 +880,13 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
/*
- * If the subscription has multiple publications and the same table has a
- * different row filter in these publications, all row filters must be
- * matched in order to replicate this change.
+ * NOTE: Multiple publication row filters have already been combined
+ * into a single exprstate.
*/
- foreach(lc, entry->exprstate)
+ if (entry->exprstate)
{
- ExprState *exprstate = (ExprState *) lfirst(lc);
-
/* Evaluates row filter */
- result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
-
- /* If the tuple does not match one of the row filters, bail out */
- if (!result)
- break;
+ result = pgoutput_row_filter_exec_expr(entry->exprstate, ecxt);
}
/* Cleanup allocated resources */
@@ -840,7 +956,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
break;
/*
@@ -873,7 +989,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -907,7 +1023,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1321,10 +1437,11 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
- entry->exprstate = NIL;
+ entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1344,7 +1461,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
List *schemaPubids = GetSchemaPublications(schemaId);
ListCell *lc;
Oid publish_as_relid = relid;
- TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1358,22 +1474,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
publications_valid = true;
}
- /* Release tuple table slot */
- if (entry->scantuple != NULL)
- {
- ExecDropSingleTupleTableSlot(entry->scantuple);
- entry->scantuple = NULL;
- }
-
- /*
- * Create a tuple table slot for row filter. TupleDesc must live as
- * long as the cache remains.
- */
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- tupdesc = CreateTupleDescCopy(tupdesc);
- entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
- MemoryContextSwitchTo(oldctx);
-
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1383,9 +1483,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
{
Publication *pub = lfirst(lc);
bool publish = false;
- HeapTuple rftuple;
- Datum rfdatum;
- bool rfisnull;
if (pub->alltables)
{
@@ -1449,33 +1546,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- /*
- * Cache row filter, if available. All publication-table mappings
- * must be checked. If it is a partition and pubviaroot is true,
- * use the row filter of the topmost partitioned table instead of
- * the row filter of its own partition.
- */
- rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
- if (HeapTupleIsValid(rftuple))
- {
- rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
-
- if (!rfisnull)
- {
- Node *rfnode;
- ExprState *exprstate;
-
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = stringToNode(TextDatumGetCString(rfdatum));
-
- /* Prepare for expression execution */
- exprstate = pgoutput_row_filter_init_expr(rfnode);
- entry->exprstate = lappend(entry->exprstate, exprstate);
- MemoryContextSwitchTo(oldctx);
- }
-
- ReleaseSysCache(rftuple);
- }
}
list_free(pubids);
@@ -1582,6 +1652,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanups. (Will be rebuilt later if needed).
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstate != NULL)
+ {
+ pfree(entry->exprstate);
+ entry->exprstate = NULL;
+ }
}
}
@@ -1622,12 +1707,6 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
-
- if (entry->exprstate != NIL)
- {
- list_free_deep(entry->exprstate);
- entry->exprstate = NIL;
- }
}
MemoryContextSwitchTo(oldctx);
--
1.8.3.1
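
[Editorial aside: in 0003, when one table appears in several publications of the
same subscription, the per-publication qualifiers are combined into a single
predicate with AND (makeBoolExpr with AND_EXPR), so a row must satisfy every
filter to be replicated. A rough Python sketch of that combination rule
(illustrative only; `combine_filters` is a made-up name, not a pgoutput function):

```python
def combine_filters(filters):
    """Combine per-publication row filters the way patch 0003 does:
    AND them into one predicate. No filters means every row passes."""
    if not filters:
        return lambda row: True
    return lambda row: all(f(row) for f in filters)

# Two hypothetical filters on the same table from two publications.
passes = combine_filters([lambda r: r["a"] > 1000,
                          lambda r: r["b"] != "filtered"])

assert passes({"a": 2000, "b": "x"})        # satisfies both filters
assert not passes({"a": 500, "b": "x"})     # fails the first filter
assert combine_filters([])({"a": 1})        # no filters: always replicate
```
]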
v39-0004-PS-Row-filter-validation-of-replica-identity.patchapplication/octet-stream; name=v39-0004-PS-Row-filter-validation-of-replica-identity.patchDownload
From 0bbe943e5d986ff3f3d84019e9b1d5648c0cafa5 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Thu, 11 Nov 2021 22:27:49 -0500
Subject: [PATCH v39 4/6] PS - Row filter validation of replica identity.
This patch introduces some additional row filter validation. Currently it is
implemented only for the publish mode "delete"; it validates that any column
referenced in the filter expression is part of the REPLICA IDENTITY or the
primary key.
Also updated test code and PG docs.
Author: Peter Smith
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
---
doc/src/sgml/ref/create_publication.sgml | 5 +-
src/backend/catalog/pg_publication.c | 97 ++++++++++++++++++++++++++-
src/test/regress/expected/publication.out | 105 +++++++++++++++++++++++++++---
src/test/regress/sql/publication.sql | 83 ++++++++++++++++++++++-
src/test/subscription/t/025_row_filter.pl | 7 +-
5 files changed, 277 insertions(+), 20 deletions(-)
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 52f6a1c..03cc956 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -231,8 +231,9 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
The <literal>WHERE</literal> clause should contain only columns that are
- part of the primary key or be covered by <literal>REPLICA
- IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
+ covered by <literal>REPLICA IDENTITY</literal>, or are part of the primary
+ key (when <literal>REPLICA IDENTITY</literal> is not set), otherwise
+ <command>DELETE</command> operations will not
be replicated. That's because old row is used and it only contains primary
key or columns that are part of the <literal>REPLICA IDENTITY</literal>; the
remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 57d08e7..eadb6d0 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,7 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parse_clause.h"
#include "parser/parse_collate.h"
#include "parser/parse_relation.h"
@@ -213,10 +214,99 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
+typedef struct
+{
+ Relation rel;
+ Bitmapset *bms_replident;
+} rf_context;
+
/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
+ * Walk the row-filter expression and check that all the referenced columns
+ * are permitted; otherwise, raise an error.
*/
+static bool
+rowfilter_expr_replident_walker(Node *node, rf_context *context)
+{
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ Oid relid = RelationGetRelid(context->rel);
+ Var *var = (Var *) node;
+ AttrNumber attnum = var->varattno;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, context->bms_replident))
+ {
+ const char *colname = get_attname(relid, attnum, false);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ RelationGetRelationName(context->rel)),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+
+ return true;
+ }
+
+ return expression_tree_walker(node, rowfilter_expr_replident_walker,
+ (void *) context);
+}
+
+/*
+ * Decide if the row-filter is valid according to the following rules:
+ *
+ * Rule 1. If the publish operation contains "delete" then only columns that
+ * are allowed by the REPLICA IDENTITY rules are permitted to be used in the
+ * row-filter WHERE clause.
+ *
+ * Rule 2. TODO
+ */
+static void
+rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+{
+ /*
+ * Rule 1: For "delete", check that filter cols are also valid replica
+ * identity cols.
+ *
+ * TODO - check later for publish "update" case.
+ */
+ if (pub->pubactions.pubdelete)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ Bitmapset *bms_okcols;
+ rf_context context = {0};
+
+ /*
+ * Find what are the cols that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+
+ context.rel = rel;
+ context.bms_replident = bms_okcols;
+ (void) rowfilter_expr_replident_walker(rfnode, &context);
+
+ bms_free(bms_okcols);
+ }
+ }
+}
+
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -315,6 +405,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 9e7f81d..5e7fe01 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -248,13 +248,15 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -264,7 +266,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -275,7 +277,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -286,7 +288,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
@@ -310,26 +312,26 @@ Publications:
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax1
Publication testpub_syntax1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE ((e < 999))
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax2
Publication testpub_syntax2
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
@@ -387,6 +389,91 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 21cc923..b127605 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -143,7 +143,9 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -163,12 +165,12 @@ RESET client_min_messages;
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax1
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax2
DROP PUBLICATION testpub_syntax2;
@@ -209,6 +211,81 @@ DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index e806b5d..dff55c2 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -223,9 +225,7 @@ $node_publisher->wait_for_catchup($appname);
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -234,7 +234,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
--
1.8.3.1
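The column-reference rules these regression tests exercise (a filter used with delete/update publishing may only reference columns covered by the table's replica identity) can be sketched as follows. This is a minimal Python model for illustration only, not the patch's C implementation; the function name and parameters are hypothetical.

```python
def filter_cols_allowed(filter_cols, replica_identity,
                        pk_cols=(), ri_index_cols=()):
    """Model of the row-filter column check for delete/update publications.

    The filter is acceptable only if every column it references is part
    of the table's replica identity.
    """
    if replica_identity == "FULL":
        return True                 # every column is in the identity
    if replica_identity == "NOTHING":
        return not filter_cols      # any column reference fails
    if replica_identity == "DEFAULT":
        identity = set(pk_cols)     # the PK, or empty if there is no PK
    else:                           # "INDEX"
        identity = set(ri_index_cols)
    return set(filter_cols) <= identity
```

For example, with REPLICA IDENTITY DEFAULT and PRIMARY KEY(a, b), a filter on `a` passes while a filter on `c` fails, matching Case 1 above.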
Attachment: v39-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patch (application/octet-stream)
From fd64b735e10021b773fb21d4f23f4dce0a2beced Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Thu, 11 Nov 2021 04:44:46 -0500
Subject: [PATCH v39 2/6] PS - Add tab auto-complete support for the Row Filter
WHERE.
e.g.
"CREATE PUBLICATION <name> FOR TABLE <name>" - complete with "WHERE (".
"ALTER PUBLICATION <name> ADD|SET TABLE <name>" - complete with "WHERE (".
Author: Peter Smith
---
src/bin/psql/tab-complete.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 4f724e4..8c7fe7d 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,14 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (");
+ /* "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with table attributes */
+ /* "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2757,10 +2765,13 @@ psql_completion(const char *text, int start, int end)
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
+ /* "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
/*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
--
1.8.3.1
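The v39-0005 patch below restricts row-filter expressions to a small node whitelist. Its walker's decision logic can be sketched like this. This is a simplified Python model under stated assumptions (the node kinds and the 16384 boundary for user-defined OIDs mirror the patch; the dict representation is illustrative, not the real parse-tree API).

```python
FIRST_NORMAL_OBJECT_ID = 16384  # first OID assigned to user-defined objects

def check_node(node):
    """Return None if the parse node is allowed in a row filter,
    else the error string the walker would report."""
    kind = node["kind"]
    if kind in ("Var", "Const", "BoolExpr"):
        return None
    if kind == "OpExpr":
        # built-in operators only
        if node["opno"] >= FIRST_NORMAL_OBJECT_ID:
            return "user-defined operators are not allowed"
        return None
    if kind == "FuncExpr":
        # built-in, IMMUTABLE functions only
        if node["funcid"] >= FIRST_NORMAL_OBJECT_ID:
            return "user-defined functions are not allowed"
        if node["volatility"] != "immutable":
            return "system functions that are not IMMUTABLE are not allowed"
        return None
    # anything else (subqueries, aggregates, ...) is rejected outright
    return "only simple expressions are allowed"
```

So `WHERE (a < random())` is rejected (random() is volatile) while `WHERE (a < int8inc(999))` passes, as in the updated regression tests.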
Attachment: v39-0005-PS-Row-filter-validation-walker.patch (application/octet-stream)
From bf26465f706a4077659a9b5a04f74dcb514c2ef8 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Fri, 12 Nov 2021 01:19:35 -0500
Subject: [PATCH v39 5/6] PS - Row filter validation walker.
This patch implements a parse-tree "walker" to validate a row-filter expression.
Only very simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no user-defined functions.
- no system functions (unless they are IMMUTABLE). See design decision at [1].
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr
[1] https://www.postgresql.org/message-id/CAA4eK1%2BXoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ%40mail.gmail.com
This patch also disables (#if 0) all other row-filter errors which were thrown for
EXPR_KIND_PUBLICATION_WHERE in the 0001 patch.
Some regression tests are updated due to modified validation messages and rules.
Author: Peter Smith
---
src/backend/catalog/dependency.c | 93 +++++++++++++++++++++++++++++++
src/backend/catalog/pg_publication.c | 22 +++++---
src/backend/parser/parse_agg.c | 5 +-
src/backend/parser/parse_expr.c | 6 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 2 +
src/include/catalog/dependency.h | 3 +-
src/test/regress/expected/publication.out | 26 ++++++---
src/test/regress/sql/publication.sql | 12 +++-
9 files changed, 154 insertions(+), 18 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index fe9c714..ba6de0a 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -133,6 +133,12 @@ typedef struct
int subflags; /* flags to pass down when recursing to obj */
} ObjectAddressAndFlags;
+/* for rowfilter_walker */
+typedef struct
+{
+ char *relname;
+} rf_context;
+
/* for find_expr_references_walker */
typedef struct
{
@@ -1569,6 +1575,93 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Walker checks that the row filter expression is legal. Allow only simple
+ * or compound expressions like:
+ *
+ * "(Var Op Const)" or
+ * "(Var Op Const) Bool (Var Op Const)"
+ *
+ * User-defined operators are not allowed.
+ * User-defined functions are not allowed.
+ * System functions that are not IMMUTABLE are not allowed.
+ */
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+ bool too_complex = false;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var) || IsA(node, Const) || IsA(node, BoolExpr))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ Oid funcid = ((FuncExpr *)node)->funcid;
+ char *funcname = get_func_name(funcid);
+
+ /*
+ * User-defined functions are not allowed.
+ * System-functions that are not IMMUTABLE are not allowed.
+ */
+ if (funcid >= FirstNormalObjectId)
+ {
+ forbidden = psprintf("user-defined functions are not allowed: %s",
+ funcname);
+ }
+ else
+ {
+ if (func_volatile(funcid) != PROVOLATILE_IMMUTABLE)
+ forbidden = psprintf("system functions that are not IMMUTABLE are not allowed: %s",
+ funcname);
+ }
+ }
+ else
+ {
+ elog(DEBUG1, "the row filter contained something unexpected: %s", nodeToString(node));
+ too_complex = true;
+ }
+
+ if (too_complex)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ context->relname),
+ errhint("only simple expressions using columns, constants and immutable system functions are allowed")
+ ));
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ context->relname),
+ errdetail("%s", forbidden)
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
+/*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if it encounters anything not permitted or unexpected.
+ */
+void
+rowfilter_validator(char *relname, Node *expr)
+{
+ rf_context context = {0};
+
+ context.relname = relname;
+ rowfilter_walker(expr, &context);
+}
+
+/*
* recordDependencyOnExpr - find expression dependencies
*
* This is used to find the dependencies of rules, constraint expressions,
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index eadb6d0..3a6f7a6 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -258,17 +258,25 @@ rowfilter_expr_replident_walker(Node *node, rf_context *context)
/*
* Decide if the row-filter is valid according to the following rules:
*
- * Rule 1. If the publish operation contains "delete" then only columns that
+ * Rule 1. Walk the parse-tree and reject anything other than very simple
+ * expressions (See rowfilter_validator for details on what is permitted).
+ *
+ * Rule 2. If the publish operation contains "delete" then only columns that
* are allowed by the REPLICA IDENTITY rules are permitted to be used in the
* row-filter WHERE clause.
- *
- * Rule 2. TODO
*/
static void
-rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
+rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node *rfnode, Relation rel)
{
+ char *relname = RelationGetRelationName(rel);
+
+ /*
+ * Rule 1. Walk the parse-tree and reject anything unexpected.
+ */
+ rowfilter_validator(relname, rfnode);
+
/*
- * Rule 1: For "delete", check that filter cols are also valid replica
+ * Rule 2: For "delete", check that filter cols are also valid replica
* identity cols.
*
* TODO - check later for publish "update" case.
@@ -401,13 +409,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
whereclause = transformWhereClause(pstate,
copyObject(pri->whereClause),
EXPR_KIND_PUBLICATION_WHERE,
- "PUBLICATION");
+ "PUBLICATION WHERE");
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
/* Validate the row-filter. */
- rowfilter_expr_checker(pub, whereclause, targetrel);
+ rowfilter_expr_checker(pub, pstate, whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..5e0c391 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,12 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
if (isAgg)
err = _("aggregate functions are not allowed in publication WHERE expressions");
else
err = _("grouping operations are not allowed in publication WHERE expressions");
-
+#endif
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +952,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("window functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..3519e62 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -201,6 +201,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
case T_FuncCall:
{
+#if 0
/*
* Forbid functions in publication WHERE condition
*/
@@ -209,6 +210,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("functions are not allowed in publication WHERE expressions"),
parser_errposition(pstate, exprLocation(expr))));
+#endif
result = transformFuncCall(pstate, (FuncCall *) expr);
break;
@@ -1777,7 +1779,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("cannot use subquery in publication WHERE expression");
+#endif
break;
/*
@@ -3100,7 +3104,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 29bebb7..212f473 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,10 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+ pstate->p_hasTargetSRFs = true;
+#if 0
err = _("set-returning functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..b3588df 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,12 +718,14 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+#if 0
/* Check it's not a custom operator for publication WHERE expressions */
if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("user-defined operators are not allowed in publication WHERE expressions"),
parser_errposition(pstate, location)));
+#endif
/* Do typecasting and build the expression tree */
if (ltree == NULL)
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 3eca295..ab9bfe3 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -16,7 +16,6 @@
#include "catalog/objectaddress.h"
-
/*
* Precise semantics of a dependency relationship are specified by the
* DependencyType code (which is stored in a "char" field in pg_depend,
@@ -152,6 +151,8 @@ extern void performDeletion(const ObjectAddress *object,
extern void performMultipleDeletions(const ObjectAddresses *objects,
DropBehavior behavior, int flags);
+extern void rowfilter_validator(char *relname, Node *expr);
+
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
DependencyType behavior);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 5e7fe01..1f51560 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -356,18 +356,29 @@ CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE
ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
RESET client_min_messages;
-- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: user-defined functions are not allowed: testpub_rf_func99
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: system functions that are not IMMUTABLE are not allowed: random
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+HINT: only simple expressions using columns, constants and immutable system functions are allowed
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
@@ -389,6 +400,7 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
-- ======================================================
-- More row filter tests for validating column references
CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index b127605..d59a08e 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -185,12 +185,21 @@ CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
RESET client_min_messages;
-- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
SET client_min_messages = 'ERROR';
@@ -210,6 +219,7 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
-- ======================================================
-- More row filter tests for validating column references
--
1.8.3.1
Attachment: v39-0006-Support-updates-based-on-old-and-new-tuple-in-ro.patch (application/octet-stream)
From 88513a96da650ae0af7162fcb124c0fab7280e7a Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Fri, 12 Nov 2021 04:54:10 -0500
Subject: [PATCH v39 6/6] Support updates based on old and new tuple in row
filters.
When applying row filter on updates, check both old_tuple and new_tuple
to decide how an update needs to be transformed.
UPDATE
old-row (match) new-row (no match) -> DELETE
old-row (no match) new row (match) -> INSERT
old-row (match) new row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Also, tuples that have been deformed are cached in slots to avoid
deforming the same tuple multiple times.
Author: Ajin Cherian
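The transformation table in the commit message above can be sketched as a small decision function. This is a minimal Python model of the mapping only (the function name is illustrative; the patch implements this in C in pgoutput_row_filter_update_check).

```python
def transform_update(old_matches, new_matches):
    """Map the row filter's verdict on the old and new tuple of an
    UPDATE to the action actually published."""
    if old_matches and not new_matches:
        return "DELETE"   # row leaves the filtered set
    if not old_matches and new_matches:
        return "INSERT"   # row enters the filtered set
    if old_matches and new_matches:
        return "UPDATE"   # row stays in the filtered set
    return None           # neither tuple matches: drop the change
```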
---
src/backend/catalog/pg_publication.c | 12 +-
src/backend/replication/logical/proto.c | 120 +++++++++++++
src/backend/replication/pgoutput/pgoutput.c | 251 ++++++++++++++++++++++++++--
src/include/replication/logicalproto.h | 4 +
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/025_row_filter.pl | 4 +-
6 files changed, 368 insertions(+), 29 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 3a6f7a6..876b47c 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -261,9 +261,9 @@ rowfilter_expr_replident_walker(Node *node, rf_context *context)
* Rule 1. Walk the parse-tree and reject anything other than very simple
* expressions (See rowfilter_validator for details on what is permitted).
*
- * Rule 2. If the publish operation contains "delete" then only columns that
- * are allowed by the REPLICA IDENTITY rules are permitted to be used in the
- * row-filter WHERE clause.
+ * Rule 2. If the publish operation contains "delete" or "update" then only
+ * columns that are allowed by the REPLICA IDENTITY rules are permitted to
+ * be used in the row-filter WHERE clause.
*/
static void
rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node *rfnode, Relation rel)
@@ -276,12 +276,10 @@ rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node *rfnode, Relat
rowfilter_validator(relname, rfnode);
/*
- * Rule 2: For "delete", check that filter cols are also valid replica
+ * Rule 2: For "delete" and "update", check that filter cols are also valid replica
* identity cols.
- *
- * TODO - check later for publish "update" case.
*/
- if (pub->pubactions.pubdelete)
+ if (pub->pubactions.pubdelete || pub->pubactions.pubupdate)
{
char replica_identity = rel->rd_rel->relreplident;
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..aac334d 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -32,6 +33,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
HeapTuple tuple, bool binary);
+static void logicalrep_write_tuple_cached(StringInfo out, Relation rel,
+ TupleTableSlot *slot, bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -438,6 +441,37 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
}
/*
+ * Write UPDATE to the output stream using cached virtual slots.
+ * Cached updates will have both old tuple and new tuple.
+ */
+void
+logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple, bool binary)
+{
+ pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
+
+ Assert(rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT ||
+ rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
+ rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX);
+
+ /* transaction ID (if not valid, we're not streaming) */
+ if (TransactionIdIsValid(xid))
+ pq_sendint32(out, xid);
+
+ /* use Oid as relation identifier */
+ pq_sendint32(out, RelationGetRelid(rel));
+
+ if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ pq_sendbyte(out, 'O'); /* old tuple follows */
+ else
+ pq_sendbyte(out, 'K'); /* old key follows */
+ logicalrep_write_tuple_cached(out, rel, oldtuple, binary);
+
+ pq_sendbyte(out, 'N'); /* new tuple follows */
+ logicalrep_write_tuple_cached(out, rel, newtuple, binary);
+}
+
+/*
* Write UPDATE to the output stream.
*/
void
@@ -746,6 +780,92 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
}
/*
+ * Write a tuple to the outputstream using cached slot, in the most efficient format possible.
+ */
+static void
+logicalrep_write_tuple_cached(StringInfo out, Relation rel, TupleTableSlot *slot, bool binary)
+{
+ TupleDesc desc;
+ int i;
+ uint16 nliveatts = 0;
+ HeapTuple tuple = ExecFetchSlotHeapTuple(slot, false, NULL);
+
+ desc = RelationGetDescr(rel);
+
+ for (i = 0; i < desc->natts; i++)
+ {
+ if (TupleDescAttr(desc, i)->attisdropped || TupleDescAttr(desc, i)->attgenerated)
+ continue;
+ nliveatts++;
+ }
+ pq_sendint16(out, nliveatts);
+
+ /* try to allocate enough memory from the get-go */
+ enlargeStringInfo(out, tuple->t_len +
+ nliveatts * (1 + 4));
+
+ /* Write the values */
+ for (i = 0; i < desc->natts; i++)
+ {
+ HeapTuple typtup;
+ Form_pg_type typclass;
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ if (att->attisdropped || att->attgenerated)
+ continue;
+
+ if (slot->tts_isnull[i])
+ {
+ pq_sendbyte(out, LOGICALREP_COLUMN_NULL);
+ continue;
+ }
+
+ if (att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(slot->tts_values[i]))
+ {
+ /*
+ * Unchanged toasted datum. (Note that we don't promise to detect
+ * unchanged data in general; this is just a cheap check to avoid
+ * sending large values unnecessarily.)
+ */
+ pq_sendbyte(out, LOGICALREP_COLUMN_UNCHANGED);
+ continue;
+ }
+
+ typtup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(att->atttypid));
+ if (!HeapTupleIsValid(typtup))
+ elog(ERROR, "cache lookup failed for type %u", att->atttypid);
+ typclass = (Form_pg_type) GETSTRUCT(typtup);
+
+ /*
+ * Send in binary if requested and type has suitable send function.
+ */
+ if (binary && OidIsValid(typclass->typsend))
+ {
+ bytea *outputbytes;
+ int len;
+
+ pq_sendbyte(out, LOGICALREP_COLUMN_BINARY);
+ outputbytes = OidSendFunctionCall(typclass->typsend, slot->tts_values[i]);
+ len = VARSIZE(outputbytes) - VARHDRSZ;
+ pq_sendint(out, len, 4); /* length */
+ pq_sendbytes(out, VARDATA(outputbytes), len); /* data */
+ pfree(outputbytes);
+ }
+ else
+ {
+ char *outputstr;
+
+ pq_sendbyte(out, LOGICALREP_COLUMN_TEXT);
+ outputstr = OidOutputFunctionCall(typclass->typoutput, slot->tts_values[i]);
+ pq_sendcountedtext(out, outputstr, strlen(outputstr), false);
+ pfree(outputstr);
+ }
+
+ ReleaseSysCache(typtup);
+ }
+}
+
+/*
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index f9fdbb0..e7f2fd4 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -133,7 +133,10 @@ typedef struct RelationSyncEntry
*/
bool rowfilter_valid;
ExprState *exprstate; /* ExprState for row filter(s) */
- TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
+ TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
+ TupleTableSlot *tmp_new_tuple; /* slot for temporary new tuple used for expression evaluation */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -168,10 +171,16 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot,
+ RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry,
+ ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -735,17 +744,104 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * Update is checked against the row filter, if any.
+ * Updates are transformed to inserts and deletes based on the
+ * old_tuple and new_tuple. The new action is updated in the
+ * action parameter. If not updated, action remains as update.
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (no match) new row (match) -> INSERT
+ * old-row (match) new row (match) -> UPDATE
+ * old-row (no match) new-row (no match) -> (drop change)
+ * If the change is to be replicated returns true, else false.
*/
static bool
-pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
+{
+ TupleDesc desc = entry->scantuple->tts_tupleDescriptor;
+ int i;
+ bool old_matched, new_matched;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ /* update requires a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /*
+ * If no old_tuple, then none of the replica identity columns changed
+ * and this would reduce to a simple update.
+ */
+ if (!oldtuple)
+ {
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ return pgoutput_row_filter(relation, NULL, newtuple, entry);
+ }
+
+ old_slot = entry->old_tuple;
+ new_slot = entry->new_tuple;
+ tmp_new_slot = entry->tmp_new_tuple;
+ ExecClearTuple(old_slot);
+ ExecClearTuple(new_slot);
+ ExecClearTuple(tmp_new_slot);
+
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, new_slot);
+
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ /* if the column in the new_tuple is null, nothing to do */
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ /*
+ * Unchanged toasted replica identity columns are
+ * only detoasted in the old tuple, copy this over to the newtuple.
+ */
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ }
+
+ }
+
+ old_matched = pgoutput_row_filter_virtual(relation, old_slot, entry);
+ new_matched = pgoutput_row_filter_virtual(relation, tmp_new_slot, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && !old_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter, the first time.
+ */
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
{
- EState *estate;
- ExprContext *ecxt;
ListCell *lc;
- bool result = true;
Oid relid = RelationGetRelid(relation);
List *rfnodes = NIL;
int n_filters;
@@ -763,7 +859,7 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
TupleDesc tupdesc = RelationGetDescr(relation);
/*
- * Create a tuple table slot for row filter. TupleDesc must live as
+ * Create tuple table slots for row filter. TupleDesc must live as
* long as the cache remains. Release the tuple table slot if it
* already exists.
*/
@@ -772,9 +868,28 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
+ if (entry->old_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->old_tuple);
+ entry->old_tuple = NULL;
+ }
+ if (entry->new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->new_tuple);
+ entry->new_tuple = NULL;
+ }
+ if (entry->tmp_new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->tmp_new_tuple);
+ entry->tmp_new_tuple = NULL;
+ }
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+
MemoryContextSwitchTo(oldctx);
/*
@@ -860,6 +975,66 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
entry->rowfilter_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = slot;
+
+ /*
+ * NOTE: Multiple publication row-filters have already been combined to a
+ * single exprstate.
+ */
+ if (entry->exprstate)
+ {
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(entry->exprstate, ecxt);
+ }
+
+ /* Cleanup allocated resources */
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
/* Bail out if there is no row filter */
if (!entry->exprstate)
@@ -890,7 +1065,6 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
}
/* Cleanup allocated resources */
- ResetExprContext(ecxt);
FreeExecutorState(estate);
PopActiveSnapshot();
@@ -948,6 +1122,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -956,7 +1133,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
break;
/*
@@ -987,9 +1164,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
- /* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter_update_check(relation, oldtuple, newtuple, relentry,
+ &modified_action))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1012,8 +1190,29 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ if (relentry->new_tuple != NULL && !TTS_EMPTY(relentry->new_tuple))
+ logicalrep_write_update_cached(ctx->out, xid, relation,
+ relentry->old_tuple, relentry->new_tuple, data->binary);
+ else
+ logicalrep_write_update(ctx->out, xid, relation, oldtuple,
+ newtuple, data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1023,7 +1222,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1441,6 +1640,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
+ entry->new_tuple = NULL;
+ entry->old_tuple = NULL;
+ entry->tmp_new_tuple = NULL;
entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
@@ -1662,6 +1864,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
+ if (entry->new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->new_tuple);
+ entry->new_tuple = NULL;
+ }
+ if (entry->old_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->old_tuple);
+ entry->old_tuple = NULL;
+ }
+ if (entry->tmp_new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->tmp_new_tuple);
+ entry->tmp_new_tuple = NULL;
+ }
if (entry->exprstate != NULL)
{
free(entry->exprstate);
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..ba71f3f 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -212,6 +213,9 @@ extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
HeapTuple newtuple, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index dff55c2..3fc503f 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -220,7 +220,8 @@ $node_publisher->wait_for_catchup($appname);
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+# but row deleted after the update below.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -232,7 +233,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
--
1.8.3.1
On Fri, Nov 12, 2021 at 9:19 PM Ajin Cherian <itsajin@gmail.com> wrote:
Attaching version 39-
Here are some review comments for v39-0006:
1)
@@ -261,9 +261,9 @@ rowfilter_expr_replident_walker(Node *node,
rf_context *context)
* Rule 1. Walk the parse-tree and reject anything other than very simple
* expressions (See rowfilter_validator for details on what is permitted).
*
- * Rule 2. If the publish operation contains "delete" then only columns that
- * are allowed by the REPLICA IDENTITY rules are permitted to be used in the
- * row-filter WHERE clause.
+ * Rule 2. If the publish operation contains "delete" or "delete" then only
+ * columns that are allowed by the REPLICA IDENTITY rules are permitted to
+ * be used in the row-filter WHERE clause.
*/
static void
rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node
*rfnode, Relation rel)
@@ -276,12 +276,10 @@ rowfilter_expr_checker(Publication *pub,
ParseState *pstate, Node *rfnode, Relat
rowfilter_validator(relname, rfnode);
/*
- * Rule 2: For "delete", check that filter cols are also valid replica
+ * Rule 2: For "delete" and "update", check that filter cols are also
valid replica
* identity cols.
- *
- * TODO - check later for publish "update" case.
*/
- if (pub->pubactions.pubdelete)
1a)
Typo - the function comment: "delete" or "delete"; should say:
"delete" or "update"
1b)
I felt it would be better (for the comment in the function body) to
write it as "or" instead of "and" because then it matches with the
code "if ||" that follows this comment.
====
2)
@@ -746,6 +780,92 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
}
/*
+ * Write a tuple to the outputstream using cached slot, in the most
efficient format possible.
+ */
+static void
+logicalrep_write_tuple_cached(StringInfo out, Relation rel,
TupleTableSlot *slot, bool binary)
The function logicalrep_write_tuple_cached seems to have almost all of
its function body in common with logicalrep_write_tuple. Is there any
good way to combine these functions to avoid ~80 lines mostly
duplicated code?
====
3)
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && !old_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+
+ return true;
I felt it is slightly confusing to have inconsistent ordering of the
old_matched and new_matched in those above conditions.
I suggest to use the order like:
* old-row (no match) new-row (no match)
* old-row (no match) new row (match)
* old-row (match) new-row (no match)
* old-row (match) new row (match)
And then be sure to keep consistent ordering in all places it is mentioned:
* in the code
* in the function header comment
* in the commit comment
* in docs?
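For what it's worth, the suggested consistent ordering could be written as a simple decision function. This is only a self-contained toy sketch (the enum and function names here are illustrative stand-ins, not the real pgoutput code, and CHANGE_DROP models "return false / drop the change"):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-in for ReorderBufferChangeType (the real enum is in
 * reorderbuffer.h). CHANGE_DROP models "drop the change". */
typedef enum { CHANGE_DROP, CHANGE_INSERT, CHANGE_DELETE, CHANGE_UPDATE } change_type;

/*
 * Transform an UPDATE based on row-filter matches, in the suggested order:
 *   old-row (no match) new-row (no match) -> (drop change)
 *   old-row (no match) new-row (match)    -> INSERT
 *   old-row (match)    new-row (no match) -> DELETE
 *   old-row (match)    new-row (match)    -> UPDATE
 */
change_type
transform_update(bool old_matched, bool new_matched)
{
    if (!old_matched && !new_matched)
        return CHANGE_DROP;
    if (!old_matched && new_matched)
        return CHANGE_INSERT;
    if (old_matched && !new_matched)
        return CHANGE_DELETE;
    return CHANGE_UPDATE;
}
```

Keeping the conditions in this fixed order makes it easy to cross-check the code against the function header comment and the commit message.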
====
4)
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot,
RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
It seems like that elog may consume unnecessary CPU most of the time.
I think it might be better to remove the relid declaration and rewrite
that elog as:
if (message_level_is_interesting(DEBUG3))
elog(DEBUG3, "table \"%s.%s\" has row filter",
get_namespace_name(get_rel_namespace(entry->relid)),
get_rel_name(entry->relid));
====
5)
diff --git a/src/include/replication/reorderbuffer.h
b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
This new typedef can be added to src/tools/pgindent/typedefs.list.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Wednesday, November 10, 2021 7:46 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Tue, Nov 9, 2021 at 2:03 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Friday, November 5, 2021 1:14 PM, Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v37* patches.
Thanks for your patch. I have a problem when using this patch.
The document about "create publication" in patch says:
The <literal>WHERE</literal> clause should contain only columns that are
part of the primary key or be covered by <literal>REPLICA
IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
be replicated.
But when I tried this patch, the columns contained in the WHERE clause must be
covered by REPLICA IDENTITY, but it doesn't matter if they are part of the
primary key.
(We can see it in Case 4 of publication.sql, too.) So maybe we should modify the
document.
PG Docs is changed in v38-0004 [1]. Please check if it is OK.
Thanks, this change looks good to me.
Regards
Tang
On Friday, November 12, 2021 6:20 PM Ajin Cherian <itsajin@gmail.com> wrote:
Attaching version 39-
Thanks for the new patch.
I met a problem when using "ALTER PUBLICATION ... SET TABLE ... WHERE ...", the
publisher crashed after executing this statement.
Here is some information about this problem.
Steps to reproduce:
-- publisher
create table t(a int primary key, b int);
create publication pub for table t where (a>5);
-- subscriber
create table t(a int primary key, b int);
create subscription sub connection 'dbname=postgres port=5432' publication pub;
-- publisher
insert into t values (1, 2);
alter publication pub set table t where (a>7);
Publisher log:
2021-11-15 13:36:54.997 CST [3319891] LOG: logical decoding found consistent point at 0/15208B8
2021-11-15 13:36:54.997 CST [3319891] DETAIL: There are no running transactions.
2021-11-15 13:36:54.997 CST [3319891] STATEMENT: START_REPLICATION SLOT "sub" LOGICAL 0/0 (proto_version '3', publication_names '"pub"')
double free or corruption (out)
2021-11-15 13:36:55.072 CST [3319746] LOG: received fast shutdown request
2021-11-15 13:36:55.073 CST [3319746] LOG: aborting any active transactions
2021-11-15 13:36:55.105 CST [3319746] LOG: background worker "logical replication launcher" (PID 3319874) exited with exit code 1
2021-11-15 13:36:55.105 CST [3319869] LOG: shutting down
2021-11-15 13:36:55.554 CST [3319746] LOG: server process (PID 3319891) was terminated by signal 6: Aborted
2021-11-15 13:36:55.554 CST [3319746] DETAIL: Failed process was running: START_REPLICATION SLOT "sub" LOGICAL 0/0 (proto_version '3', publication_names '"pub"')
2021-11-15 13:36:55.554 CST [3319746] LOG: terminating any other active server processes
Backtrace is attached. I think maybe the problem is related to the below change in 0003 patch:
+ free(entry->exprstate);
Regards
Tang
Attachments:
On Fri, Nov 12, 2021 at 3:49 PM Ajin Cherian <itsajin@gmail.com> wrote:
Attaching version 39-
V39 fixes the following review comments:
On Fri, Nov 5, 2021 at 7:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
I think for the correct merge you need to just call
CheckObjSchemaNotAlreadyInPublication() before this for loop. BTW, I
have a question regarding this implementation. Here, it has been
assumed that the new rel will always be specified with a different
qual, what if there is no qual or if the qual is the same?
Actually with this code, no qual or a different qual does not matter,
it recreates everything as specified by the ALTER SET command.
I have added CheckObjSchemaNotAlreadyInPublication as you specified since this
is required to match the schema patch behaviour. I've also added
a test case that tests this particular case.
What I meant was that with this new code we have regressed the old
behavior. Basically, imagine a case where no filter was given for any
of the tables. Then after the patch, we will remove all the old tables
whereas before the patch it will remove the oldrels only when they are
not specified as part of new rels. If you agree with this, then we can
retain the old behavior and for the new tables, we can always override
the where clause for a SET variant of command.
--
With Regards,
Amit Kapila.
On Fri, Nov 12, 2021 at 3:49 PM Ajin Cherian <itsajin@gmail.com> wrote:
Attaching version 39-
Some comments on 0006
--
/*
+ * Write UPDATE to the output stream using cached virtual slots.
+ * Cached updates will have both old tuple and new tuple.
+ */
+void
+logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
bool binary)
+{
Function, logicalrep_write_update_cached is exactly the same as
logicalrep_write_update, except calling logicalrep_write_tuple_cached
vs logicalrep_write_tuple. So I don't like the idea of making
complete duplicate copies. Instead, either we can keep an if check or we
can pass logicalrep_write_tuple(_cached) as a function pointer.
--
Looking further, I realized that "logicalrep_write_tuple" and
"logicalrep_write_tuple_cached" are completely duplicate except first
one is calling "heap_deform_tuple" and then using local values[] array
and the second one is directly using the slot->values[] array, so in
fact we can pass this also as a parameter, or we can put in just one if
check to populate the values[] and null arrays: if it is cached we
will point directly to the slot->values[], otherwise we call
heap_deform_tuple(). I think this should be just one simple check.
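The "one simple check" being suggested might look like the following minimal sketch. It is not the real logicalrep_write_tuple code: plain int arrays stand in for slot->tts_values[] and a memcpy stands in for heap_deform_tuple(), just to show the shape of the refactoring:

```c
#include <assert.h>
#include <string.h>

#define MAX_ATTS 16

/* Stand-in for heap_deform_tuple(): "deform" a raw tuple into values[]. */
static void
deform_tuple(const int *raw, int natts, int *values)
{
    memcpy(values, raw, natts * sizeof(int));
}

/*
 * One writer for both paths: if the caller already has deformed values
 * (the cached virtual slot case), use them directly; otherwise deform
 * locally. The per-column output loop is shared instead of duplicated.
 */
int
write_tuple(const int *cached_values, const int *raw_tuple, int natts, int *out)
{
    int         local[MAX_ATTS];
    const int  *values;

    if (cached_values != NULL)
        values = cached_values;     /* slot->tts_values[] path */
    else
    {
        deform_tuple(raw_tuple, natts, local);
        values = local;             /* heap_deform_tuple() path */
    }

    for (int i = 0; i < natts; i++)
        out[i] = values[i];         /* shared "send column" loop */
    return natts;
}
```

Both call sites then produce identical output, with the ~80 duplicated lines collapsed into one branch at the top.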
--
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot,
RelationSyncEntry *entry)
IMHO, the comments should explain how it is different from the
pgoutput_row_filter function. Also comments are saying "If it returns
true, the change is replicated, otherwise, it is not" which is not
exactly true for this function, I mean based on that the caller will
change the action. So I think it is enough to say what this function
is doing but not required to say what the caller will do based on what
this function returns.
--
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ /* if the column in the new_tuple is null, nothing to do */
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
Put some comments over this loop about what it is trying to do, and
overall I think there are not sufficient comments in the
pgoutput_row_filter_update_check function.
--
+ /*
+ * Unchanged toasted replica identity columns are
+ * only detoasted in the old tuple, copy this over to the newtuple.
+ */
+ if ((att->attlen == -1 &&
VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
Is it ever possible that if the attribute is not NULL in the old slot
still it is stored as VARATT_IS_EXTERNAL_ONDISK? I think not, so instead of
adding this last condition to the check, it should be asserted inside the if check.
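To illustrate the suggested restructuring, here is a toy model (the struct fields stand in for att->attlen == -1, VARATT_IS_EXTERNAL_ONDISK() and tts_isnull[]; this is not the real slot code):

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal model of one attribute in the old/new slots. */
typedef struct
{
    bool isnull;
    bool varlena;       /* stands in for att->attlen == -1 */
    bool ondisk;        /* stands in for VARATT_IS_EXTERNAL_ONDISK() */
    int  value;
} attr;

/*
 * If the new value is an on-disk toast pointer for an unchanged replica
 * identity column and the old value is not null, the old tuple must hold
 * the detoasted value, so that is asserted rather than tested in the
 * if-condition.
 */
void
copy_unchanged_toast(attr *new_att, const attr *old_att)
{
    if (new_att->varlena && new_att->ondisk && !old_att->isnull)
    {
        assert(!old_att->ondisk);   /* old value must already be detoasted */
        new_att->value = old_att->value;
        new_att->ondisk = false;
    }
}
```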
--
static bool
-pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple
oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter_update_check(Relation relation, HeapTuple
oldtuple, HeapTuple newtuple, RelationSyncEntry *entry,
ReorderBufferChangeType *action)
+{
This function definition header is too long to fit in one line, so
better to break it. I think running will be a good idea.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Mon, Nov 15, 2021 at 2:44 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Fri, Nov 12, 2021 at 3:49 PM Ajin Cherian <itsajin@gmail.com> wrote:
This function definition header is too long to fit in one line, so
better to break it. I think running will be a good idea.
It seems in the last line you are suggesting to run pgindent but it is
not clear as the word 'pgindent' is missing?
--
With Regards,
Amit Kapila.
On Mon, 15 Nov 2021 at 3:07 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Nov 15, 2021 at 2:44 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Fri, Nov 12, 2021 at 3:49 PM Ajin Cherian <itsajin@gmail.com> wrote:
This function definition header is too long to fit in one line, so
better to break it. I think running will be a good idea.
It seems in the last line you are suggesting to run pgindent but it is
not clear as the word 'pgindent' is missing?
Yeah I intended to suggest pgindent
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Mon, Nov 15, 2021 at 5:09 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
I met a problem when using "ALTER PUBLICATION ... SET TABLE ... WHERE ...", the
publisher crashed after executing this statement.
Backtrace is attached. I think maybe the problem is related to the below change in the 0003 patch:
+ free(entry->exprstate);
I had a look at this crash problem and could reproduce it.
I made the following changes and it seemed to resolve the problem:
diff --git a/src/backend/replication/pgoutput/pgoutput.c
b/src/backend/replication/pgoutput/pgoutput.c
index e7f2fd4bad..f0cb9b8265 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -969,8 +969,6 @@ pgoutput_row_filter_init(PGOutputData *data,
Relation relation, RelationSyncEntr
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
rfnode = n_filters > 1 ? makeBoolExpr(AND_EXPR, rfnodes,
-1) : linitial(rfnodes);
entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
-
- list_free(rfnodes);
}
entry->rowfilter_valid = true;
@@ -1881,7 +1879,7 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
}
if (entry->exprstate != NULL)
{
- free(entry->exprstate);
+ pfree(entry->exprstate);
entry->exprstate = NULL;
}
}
Regards,
Greg Nancarrow
Fujitsu Australia
On Wed, Nov 10, 2021 at 12:36 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Mon, Nov 8, 2021 at 5:53 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
3) v37-0005
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr
I think there could be other node types which can also be considered simple
expressions, for example T_NullIfExpr.
The current walker restrictions are from a previously agreed decision
by Amit/Tomas [1] and from an earlier suggestion from Andres [2] to
keep everything very simple for a first version.
Yes, you are right, there might be some additional node types that
might be fine, but at this time I don't want to add anything different
without getting their approval to do so. Anyway, additions like this
are all candidates for a future version of this row-filter feature.
I think we can consider T_NullIfExpr unless you see any problem with the same.
Personally, I think it's natural to only check the IMMUTABLE and
whether-user-defined in the new function rowfilter_walker. We can keep the
other row-filter errors which were thrown for EXPR_KIND_PUBLICATION_WHERE in
the 0001 patch.
YMMV. IMO it is much more convenient for all the filter validations to
be centralized just in one walker function instead of scattered all
over the place like they were in the 0001 patch.
+1.
Few comments on the latest set of patches (v39*)
=======================================
0001*
1.
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
I don't think such a renaming (targetrel-->pri) is warranted for this
patch. If we really want something like this, we can probably do it in
a separate patch.
2.
+ * The OptWhereClause (row-filter) must be stored here
+ * but it is valid only for tables. If the ColId was
+ * mistakenly not a table this will be detected later
+ * in preprocess_pubobj_list() and an error thrown.
/error thrown/error is thrown
0003*
3. In pgoutput_row_filter(), the patch is finding pub_relid when it
should already be there in RelationSyncEntry->publish_as_relid found
during get_rel_sync_entry call. Is there a reason to do this work
again?
4. I think we should add some comments in pgoutput_row_filter() as to
why we are caching the row_filter here instead of
get_rel_sync_entry()? That has been discussed multiple times so it is
better to capture that in comments.
5. Why do you need a separate variable rowfilter_valid to indicate
whether a valid row filter exists? Why exprstate is not sufficient?
Can you update comments to indicate why we need this variable
separately?
0004*
6. In rowfilter_expr_checker(), the expression tree is traversed
twice, can't we traverse it once to detect all non-allowed stuff? It
can be sometimes costly to traverse the tree multiple times especially
when the expression is complex and it doesn't seem acceptable to do so
unless there is some genuine reason for the same.
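A single-traversal version could check both rules at each node. The sketch below is only a self-contained toy: toy_node and its boolean fields stand in for the real parse-tree Node kinds, and the real code would use expression_tree_walker() with a context struct rather than explicit recursion:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression node standing in for PostgreSQL's parse-tree Node. */
typedef struct toy_node
{
    bool simple;          /* Rule 1: is this an allowed node type? */
    bool is_column;
    bool replident_col;   /* Rule 2: column covered by replica identity? */
    struct toy_node *left;
    struct toy_node *right;
} toy_node;

/*
 * One traversal that applies both rules at every node, instead of
 * walking the tree once per rule. Returns true if the filter is valid.
 */
bool
rowfilter_walker(const toy_node *n, bool check_replident)
{
    if (n == NULL)
        return true;
    if (!n->simple)
        return false;                                     /* Rule 1 */
    if (check_replident && n->is_column && !n->replident_col)
        return false;                                     /* Rule 2 */
    return rowfilter_walker(n->left, check_replident) &&
           rowfilter_walker(n->right, check_replident);
}
```

Whether the replica-identity rule applies at all (the check_replident flag here) would come from the publication's publish actions, so the tree is still visited only once in every case.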
7.
+static void
+rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)
Keep the rel argument before whereclause as that makes the function
signature better.
With Regards,
Amit Kapila.
On Fri, Nov 12, 2021 at 9:19 PM Ajin Cherian <itsajin@gmail.com> wrote:
Attaching version 39-
Thanks for the updated patch.
Some review comments:
doc/src/sgml/ref/create_publication.sgml
(1) improve comment
+ /* Set up a pstate to parse with */
"pstate" is the variable name, better to use "ParseState".
src/test/subscription/t/025_row_filter.pl
(2) rename TAP test 025 to 026
I suggest that the t/025_row_filter.pl TAP test should be renamed to
026 now because 025 is being used by some schema TAP test.
(3) whitespace errors
The 0006 patch applies with several whitespace errors.
(4) fix crash
The pgoutput.c patch that I previously posted on this thread needs to
be applied to fix the coredump issue reported by Tang-san.
While that fixes the crash, I haven't tracked through to see
where/whether the expression nodes are actually freed or whether now
there is a possible memory leak issue that may need further
investigation.
Regards,
Greg Nancarrow
On Friday, November 12, 2021 6:20 PM Ajin Cherian <itsajin@gmail.com> wrote:
Attaching version 39-
I met another problem when filtering out with the operator '~'.
Data can't be replicated as expected.
For example:
-- publisher
create table t (a text primary key);
create publication pub for table t where (a ~ 'aaa');
-- subscriber
create table t (a text primary key);
create subscription sub connection 'port=5432' publication pub;
-- publisher
insert into t values ('aaaaab');
insert into t values ('aaaaabc');
postgres=# select * from t where (a ~ 'aaa');
a
---------
aaaaab
aaaaabc
(2 rows)
-- subscriber
postgres=# select * from t;
a
--------
aaaaab
(1 row)
The second record can’t be replicated.
By the way, when only the 0001 patch was applied, I couldn't reproduce this bug.
So I think it is related to the later patches.
Regards
Tang
On Mon, Nov 15, 2021 at 2:44 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Fri, Nov 12, 2021 at 3:49 PM Ajin Cherian <itsajin@gmail.com> wrote:
Attaching version 39-
I have reviewed, 0001* and I have a few comments on it
---
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent.
I think this comment is not correct; the correct statement would be
"only data that satisfies the row filters is pulled by the
subscriber".
---
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or be covered by <literal>REPLICA
+ IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
+ be replicated. That's because old row is used and it only contains primary
+ key or columns that are part of the <literal>REPLICA IDENTITY</literal>; the
+ remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ and <command>UPDATE</command> operations, any column might be used in the
+ <literal>WHERE</literal> clause.
I think this message is not correct, because for UPDATE we also cannot
have filters on non-key attributes, right? Even w.r.t. the first
patch, if there are unchanged non-key TOAST columns we cannot apply
filters on those either. So this comment seems misleading to me.
---
- Oid relid = RelationGetRelid(targetrel->relation);
..
+ relid = RelationGetRelid(targetrel);
+
Why is this change required? I mean, instead of fetching the relid
during the variable declaration, why do we need to do it separately
now?
---
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be
coerced to the expected type %s",
Instead of "coerced to" can we use "cast to"? That will be in sync
with other similar kinds of user-exposed error messages.
----
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
.....
+ /*
+ * Cache ExprState using CacheMemoryContext. This is the same code as
+ * ExecPrepareExpr() but that is not used because it doesn't use an EState.
+ * It should probably be another function in the executor to handle the
+ * execution outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
I can see the caller of this function is already switching to
CacheMemoryContext, so what is the point in doing it again here?
Maybe, if the caller is expected to do so, we can Assert on the
CurrentMemoryContext instead.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Nov 16, 2021 at 7:33 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
The second record can’t be replicated.
By the way, when only the 0001 patch was applied, I couldn't reproduce this bug.
So I think it is related to the later patches.
The problem seems to be caused by the 0006 patch (when I remove that
patch, the problem doesn't occur).
Still needs investigation.
Regards,
Greg Nancarrow
Fujitsu Australia
On Tue, Nov 16, 2021 at 7:33 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Friday, November 12, 2021 6:20 PM Ajin Cherian <itsajin@gmail.com> wrote:
Attaching version 39-
I met another problem when filtering out with the operator '~'.
Data can't be replicated as expected.
For example:
-- publisher
create table t (a text primary key);
create publication pub for table t where (a ~ 'aaa');
-- subscriber
create table t (a text primary key);
create subscription sub connection 'port=5432' publication pub;
-- publisher
insert into t values ('aaaaab');
insert into t values ('aaaaabc');
postgres=# select * from t where (a ~ 'aaa');
a
---------
aaaaab
aaaaabc
(2 rows)
-- subscriber
postgres=# select * from t;
a
--------
aaaaab
(1 row)
The second record can’t be replicated.
By the way, when only the 0001 patch was applied, I couldn't reproduce this bug.
So I think it is related to the later patches.
I found that the problem was caused by allocating the WHERE clause
expression nodes in the wrong memory context (so they'd end up getting
freed after first-time use).
The following additions are needed in pgoutput_row_filter_init() - patch 0005.
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
rfnode = stringToNode(TextDatumGetCString(rfdatum));
rfnodes = lappend(rfnodes, rfnode);
+ MemoryContextSwitchTo(oldctx);
(these changes are needed in addition to the fixes I posted on this
thread for the crash problem that was previously reported)
Regards,
Greg Nancarrow
Fujitsu Australia
PSA new set of v40* patches.
This addresses multiple review comments as follows:
v40-0001 = the "main" patch
- not changed
v40-0002 = tab auto-complete.
- not changed
v40-0003 = cache updates.
- fix memory bug reported by Tang, using Greg's fix [Tang 15/11]
- fix unnecessary publish_as_relid code [Amit 15/11] #3
- add more comments about delayed caching [Amit 15/11] #4
- update comment for rowfilter_valid [Amit 15/11] #5
- fix regex bug reported by Tang, using Greg's fix [Tang 16/11]
v40-0004 = combine using OR instead of AND
- this is a new patch
- new behavior. multiple filters now combine by OR instead of AND
[Tomas 23/9] #3
v40-0005 = filter validation replica identity.
- previously this was v39-0004
- rearrange args for rowfilter_expr_checker [Amit 15/11] #7
v40-0006 = filter validation walker.
- previously this was v39-0005
- now allows NULLIF [Houz 8/11] #3
v40-0007 = support old/new tuple logic for row-filters.
- previously this was v39-0006
- fix typos [Peter 15/11] #1
- function logicalrep_write_tuple_cached use more common code [Peter
15/11] #2, [Dilip 15/11] #1
- make order of old/new consistent [Peter 15/11] #3
- guard elog to be more efficient [Peter 15/11] #4
- update typedefs.list [Peter 15/11] #5
- update comment for pgoutput_row_filter_virtual function [Dilip 15/11] #2
- add more comments in pgoutput_row_filter_update_check [Dilip 15/11] #3
- add assertion [Dilip 15/11] #4
------
[Tomas 23/9] /messages/by-id/574b4e78-2f35-acf3-4bdc-4b872582e739@enterprisedb.com
[Houz 8/11] /messages/by-id/OS0PR01MB571625D4A5CC1DAB4045B2BB94919@OS0PR01MB5716.jpnprd01.prod.outlook.com
[Tang 15/11] /messages/by-id/OS0PR01MB61138751816E2BF9A0BD6EC9FB989@OS0PR01MB6113.jpnprd01.prod.outlook.com
[Amit 15/11] /messages/by-id/CAA4eK1L4ddTpc=-3bq==U8O-BJ=svkAFefRDpATKCG4hKYKAig@mail.gmail.com
[Tang 16/11] /messages/by-id/OS0PR01MB61132C0E4FFEE73D34AE9823FB999@OS0PR01MB6113.jpnprd01.prod.outlook.com
[Peter 15/11] /messages/by-id/CAHut+PsZ2xsRZw4AyRQuLfO4gYiqCpNVNDRbv_RN1XUUo3KWsw@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v40-0001-Row-filter-for-logical-replication.patch
From 13d93149a344e861b390eb23e77546b38af6fdc9 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Mon, 15 Nov 2021 14:44:47 +1100
Subject: [PATCH v40] Row filter for logical replication.
This feature adds row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If subscriber is a pre-15 version, data
synchronization won't use row filters if they are defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
Psql commands \dRp+ and \d+ will display any row filters.
Author: Euler Taveira, Peter Smith
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 13 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 50 ++++-
src/backend/commands/publicationcmds.c | 79 +++++---
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 38 +++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 257 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 27 ++-
src/include/catalog/pg_publication.h | 4 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 148 ++++++++++++++
src/test/regress/sql/publication.sql | 75 +++++++
src/test/subscription/t/025_row_filter.pl | 300 ++++++++++++++++++++++++++++
25 files changed, 1143 insertions(+), 70 deletions(-)
mode change 100644 => 100755 src/backend/parser/gram.y
create mode 100644 src/test/subscription/t/025_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1d11be..af6b1f6 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6299,6 +6299,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index bb4ef5e..01247d7 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -30,7 +30,7 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -52,7 +52,9 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
remove one or more tables/schemas from the publication. Note that adding
tables/schemas to a publication that is already subscribed to will require a
<literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on the
- subscribing side in order to become effective.
+ subscribing side in order to become effective. Note also that the combination
+ of row-filter <literal>WHERE</literal> for <literal>DROP</literal> clause is
+ not allowed.
</para>
<para>
@@ -109,7 +111,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index ca01d8c..52f6a1c 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -28,7 +28,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -76,6 +76,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -226,6 +230,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or be covered by <literal>REPLICA
+ IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
+ be replicated. That's because old row is used and it only contains primary
+ key or columns that are part of the <literal>REPLICA IDENTITY</literal>; the
+ remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ and <command>UPDATE</command> operations, any column might be used in the
+ <literal>WHERE</literal> clause. New row is used and it contains all
+ columns. A <literal>NULL</literal> value causes the expression to evaluate
+ to false; avoid using columns without not-null constraints in the
+ <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
+ not allow functions or user-defined operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -240,6 +259,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -253,6 +277,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..5a9430e 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -206,7 +206,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether to copy pre-existing data in the publications
that are being subscribed to when the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index fed83b8..57d08e7 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -251,22 +254,28 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid;
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
+ relid = RelationGetRelid(targetrel);
+
/*
* Check for duplicates. Note that this does not really prevent
* duplicates, it's here just to provide nicer error message in common
@@ -283,10 +292,30 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -300,6 +329,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -316,6 +351,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 7d4a0e9..00a0a58 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -529,41 +529,30 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *delrels = NIL;
ListCell *oldlc;
+ /*
+ * Check if the relation is member of the existing schema in the
+ * publication or member of the schema list specified.
+ */
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
-
- /* Calculate which relations to drop. */
+ /*
+ * Remove all publication-table mappings. We could possibly remove (i)
+ * tables that are not found in the new table list and (ii) tables that
+ * are being re-added with a different qual expression. For (ii),
+ * simply updating the existing tuple is not enough, because of qual
+ * expression dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
- bool found = false;
-
- foreach(newlc, rels)
- {
- PublicationRelInfo *newpubrel;
-
- newpubrel = (PublicationRelInfo *) lfirst(newlc);
- if (RelationGetRelid(newpubrel->relation) == oldrelid)
- {
- found = true;
- break;
- }
- }
- /* Not yet in the list, open it and add to the list */
- if (!found)
- {
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
- }
+ PublicationRelInfo *oldrel;
+
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->relid = oldrelid;
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrel->relid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
/* And drop them. */
@@ -899,6 +888,7 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ List *relids_with_rf = NIL;
/*
* Open, share-lock, and check all the explicitly-specified relations
@@ -926,15 +916,31 @@ OpenTableList(List *tables)
*/
if (list_member_oid(relids, myrelid))
{
+ char *relname = pstrdup(RelationGetRelationName(rel));
+
table_close(rel, ShareUpdateExclusiveLock);
+
+ /* Disallow duplicate tables if there are any with row-filters. */
+ if (t->whereClause || list_member_oid(relids_with_rf, myrelid))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("conflicting or redundant row-filters for \"%s\"",
+ relname)));
+
+ pfree(relname);
continue;
}
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = myrelid;
pub_rel->relation = rel;
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
+ if (t->whereClause)
+ relids_with_rf = lappend_oid(relids_with_rf, myrelid);
+
/*
* Add children of this rel, if requested, so that they too are added
* to the publication. A partitioned table can't have any inheritance
@@ -966,7 +972,10 @@ OpenTableList(List *tables)
/* find_all_inheritors already got lock */
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
+ pub_rel->relid = childrelid;
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -974,6 +983,7 @@ OpenTableList(List *tables)
}
list_free(relids);
+ list_free(relids_with_rf);
return rels;
}
@@ -993,6 +1003,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -1042,7 +1054,7 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(pub_rel->relid, GetUserId()))
aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
RelationGetRelationName(rel));
@@ -1088,6 +1100,11 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
RelationGetRelationName(rel))));
}
+ if (pubrel->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE")));
+
ObjectAddressSet(obj, PublicationRelRelationId, prid);
performDeletion(&obj, DROP_CASCADE, 0);
}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index ad1ea2f..e3b0039 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4830,6 +4830,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index f537d3e..5776447 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2312,6 +2312,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
old mode 100644
new mode 100755
index a6d0cef..f480846
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9652,12 +9652,13 @@ CreatePublicationStmt:
* relation_expr here.
*/
PublicationObjSpec:
- TABLE relation_expr
+ TABLE relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_TABLE;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $2;
+ $$->pubtable->whereClause = $3;
}
| ALL TABLES IN_P SCHEMA ColId
{
@@ -9672,28 +9673,45 @@ PublicationObjSpec:
$$->pubobjtype = PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA;
$$->location = @5;
}
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ /*
+ * The OptWhereClause (row-filter) must be stored here
+ * but it is valid only for tables. If the ColId was
+ * mistakenly not a table this will be detected later
+ * in preprocess_pubobj_list() and an error thrown.
+ */
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
$$->location = @1;
}
- | ColId indirection
+ | ColId indirection OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
+ $$->pubtable->whereClause = $3;
$$->location = @1;
}
/* grammar like tablename * , ONLY tablename, ONLY ( tablename ) */
- | extended_relation_expr
+ | extended_relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $1;
+ $$->pubtable->whereClause = $2;
}
| CURRENT_SCHEMA
{
@@ -17341,7 +17359,8 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ if (pubobj->name)
{
/* convert it to PublicationTable */
PublicationTable *pubtable = makeNode(PublicationTable);
@@ -17354,6 +17373,13 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
else if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_SCHEMA ||
pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA)
{
+ /* Row filters are not allowed on schema objects. */
+ if (pubobj->pubtable && pubobj->pubtable->whereClause)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid to use WHERE (row-filter) for a schema"),
+ parser_errposition(pubobj->location));
+
/*
* We can distinguish between the different type of schema
* objects based on whether name and pubtable is set.
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 542f916..29bebb7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Disallow user-defined operators in publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..9d86a10 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get relation qual. DISTINCT prevents the same expression for a table
+ * that appears in multiple publications from being included multiple
+ * times in the final expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6f6a203..077ae18 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +148,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +157,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +639,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache the ExprState in CacheMemoryContext. This mirrors the code in
+ * ExecPrepareExpr(), which is not used here because it requires an
+ * EState. There should probably be a separate executor function for
+ * preparing expressions for execution outside a normal Plan tree context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluate the row filter.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change should be replicated, false otherwise.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications that define different
+ * row filters for the same table, all row filters must match for the
+ * change to be replicated.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +808,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +832,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +839,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * The schema must be sent before the logic that switches the
+ * relation, because it also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +872,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +906,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +975,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1116,9 +1297,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1141,6 +1323,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1160,6 +1344,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *schemaPubids = GetSchemaPublications(schemaId);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1173,6 +1358,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for the row filter. The TupleDesc must
+ * live as long as the cache entry does.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1182,6 +1383,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1245,9 +1449,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1365,6 +1593,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1603,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1391,7 +1622,15 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ {
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
+ }
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 7e98371..b404fd2 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4229,6 +4229,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4239,9 +4240,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4250,6 +4258,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4290,6 +4299,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationObject(&(pubrinfo[j].dobj), fout);
@@ -4360,8 +4373,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index d1d8608..0842a3c 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -630,6 +630,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index ea721d9..8be5643 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3150,17 +3150,22 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
" JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
"WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
"UNION\n"
"SELECT pubname\n"
+ " , pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+ " , pg_catalog.pg_class c\n"
"WHERE pr.prrelid = '%s'\n"
+ " AND c.oid = pr.prrelid\n"
"UNION\n"
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
"WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
"ORDER BY 1;",
@@ -3196,6 +3201,13 @@ describeOneTableDetails(const char *schemaname,
printfPQExpBuffer(&buf, " \"%s\"",
PQgetvalue(result, i, 0));
+ if (pset.sversion >= 150000)
+ {
+ /* Also display the publication row-filter (if any) for this table */
+ if (!PQgetisnull(result, i, 1))
+ appendPQExpBuffer(&buf, " WHERE (%s)", PQgetvalue(result, i, 1));
+ }
+
printTableAddFooter(&cont, buf.data);
}
PQclear(result);
@@ -6320,8 +6332,12 @@ addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
for (i = 0; i < count; i++)
{
if (!singlecol)
+ {
printfPQExpBuffer(buf, " \"%s.%s\"", PQgetvalue(res, i, 0),
PQgetvalue(res, i, 1));
+ if (!PQgetisnull(res, i, 2))
+ appendPQExpBuffer(buf, " WHERE (%s)", PQgetvalue(res, i, 2));
+ }
else
printfPQExpBuffer(buf, " \"%s\"", PQgetvalue(res, i, 0));
@@ -6450,8 +6466,15 @@ describePublications(const char *pattern)
{
/* Get the tables for the specified publication */
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 1ae439e..964c204 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -85,7 +85,9 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
+ Oid relid;
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -122,7 +124,7 @@ extern List *GetPubPartitionOptionRelations(List *result,
Oid relid);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 067138e..5d58a9f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3641,6 +3641,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 2ff21a7..9e7f81d 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -239,6 +239,154 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+ Table "public.testpub_rf_tbl1"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+----------+--------------+-------------
+ a | integer | | | | plain | |
+ b | text | | | | extended | |
+Publications:
+ "testpub5a" WHERE ((a > 1))
+ "testpub5b"
+ "testpub5c" WHERE ((a > 3))
+
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+ Publication testpub_syntax1
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE ((e < 999))
+
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+ Publication testpub_syntax2
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
+
+DROP PUBLICATION testpub_syntax2;
+-- fail - a WHERE row-filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a =...
+ ^
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+ERROR: invalid to use WHERE (row-filter) for a schema
+LINE 1: ...ax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf...
+ ^
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+ERROR: cannot add relation "testpub_rf_myschema1.testpub_rf_tb16" to publication
+DETAIL: Table's schema "testpub_rf_myschema1" is already part of the publication or part of the specified schema list.
+RESET client_min_messages;
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 85a5302..21cc923 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -134,6 +134,81 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;
+-- fail - a WHERE row-filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+RESET client_min_messages;
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
new file mode 100644
index 0000000..e806b5d
--- /dev/null
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -0,0 +1,300 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions should succeed. In this case, rows are replicated if c value is
+# divided by 2 AND 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
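The TAP test above encodes the original semantics, where row filters from multiple publications of the same table are AND'ed together. As a minimal sketch (Python, illustrative only, not part of the patch), the rows of tab_rowfilter_2 selected by the tap_pub_1 filter (c % 2 = 0) AND'ed with the tap_pub_2 filter (c % 3 = 0) can be modeled as:

```python
# Model the pre-v40-0004 behavior: a row is replicated only if every
# publication's filter accepts it (filters are AND'ed together).
filters = [lambda c: c % 2 == 0, lambda c: c % 3 == 0]

def replicated(rows, filters):
    # Keep a row only when all filter expressions evaluate to true.
    return [c for c in rows if all(f(c) for f in filters)]

# tab_rowfilter_2 holds c = 1..20; only multiples of 6 satisfy both filters.
rows = replicated(range(1, 21), filters)
print(rows)  # [6, 12, 18], matching the test's expected 3|6|18
```

This matches the `is($result, qq(3|6|18), ...)` check in the TAP test above.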
v40-0004-PS-Combine-multiple-filters-with-OR-instead-of-A.patchapplication/octet-stream; name=v40-0004-PS-Combine-multiple-filters-with-OR-instead-of-A.patchDownload
From 367dbf99cc742a804a6b08734cfb1e638ccf46a1 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 18 Nov 2021 09:51:08 +1100
Subject: [PATCH v40] PS - Combine multiple filters with OR instead of AND.
This is a change of behavior requested by Tomas [1]. The subscription is now
treated "as a union of all the publications", so the filters are combined with
OR instead of AND.
If the subscription has several publications in which the same table has been
published with different filters, those expressions get OR'ed together so that
rows satisfying any of the expressions will be replicated.
Note that this means that if one of the publications has no filter at all,
then all other filters become redundant.
Updated documentation.
Added more test cases.
Author: Peter Smith
[1] https://www.postgresql.org/message-id/574b4e78-2f35-acf3-4bdc-4b872582e739%40enterprisedb.com
---
doc/src/sgml/ref/create_subscription.sgml | 27 +++++++++-----
src/backend/replication/logical/tablesync.c | 27 ++++++++++++--
src/backend/replication/pgoutput/pgoutput.c | 25 +++++++++++--
src/test/subscription/t/025_row_filter.pl | 56 +++++++++++++++++++++++++----
4 files changed, 115 insertions(+), 20 deletions(-)
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 5a9430e..42bf8c2 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -206,15 +206,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether to copy pre-existing data in the publications
that are being subscribed to when the replication starts.
- The default is <literal>true</literal>. If any table in the
- publications has a <literal>WHERE</literal> clause, rows that do not
- satisfy the <replaceable class="parameter">expression</replaceable>
- will not be copied. If the subscription has several publications in
- which a table has been published with different
- <literal>WHERE</literal> clauses, rows must satisfy all expressions
- to be copied. If the subscriber is a
- <productname>PostgreSQL</productname> version before 15 then any row
- filtering is ignored.
+ The default is <literal>true</literal>.
+ </para>
+ <para>
+ Row-filtering may also apply here and will affect what data is
+ copied. Refer to the Notes section below.
</para>
</listitem>
</varlistentry>
@@ -327,6 +323,19 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
the parameter <literal>create_slot = false</literal>. This is an
implementation restriction that might be lifted in a future release.
</para>
+
+ <para>
+ If any table in the publication has a <literal>WHERE</literal> clause, rows
+ that do not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be replicated. If the subscription has several publications in
+ which the same table has been published with different filters, those
+ expressions get OR'ed together so that rows satisfying any of the expressions
+ will be replicated. Notice this means if one of the publications has no filter
+ at all then all other filters become redundant. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row filtering
+ is ignored.
+ </para>
+
</refsect1>
<refsect1>
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 9d86a10..e9b7f7c 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -838,6 +838,13 @@ fetch_remote_table_info(char *nspname, char *relname,
(errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
nspname, relname, res->err)));
+ /*
+ * Multiple row-filter expressions for the same table will later be
+ * combined by the COPY using OR, but this means if any of the filters is
+ * null, then effectively none of the other filters is meaningful. So this
+ * loop is also checking for null filters and can exit early if any are
+ * encountered.
+ */
slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
{
@@ -847,6 +854,20 @@ fetch_remote_table_info(char *nspname, char *relname,
*qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
ExecClearTuple(slot);
+
+ if (isnull)
+ {
+ /*
+ * A single null filter nullifies the effect of any other filter for this
+ * table.
+ */
+ if (*qual)
+ {
+ list_free(*qual);
+ *qual = NIL;
+ }
+ break;
+ }
}
ExecDropSingleTupleTableSlot(slot);
@@ -896,7 +917,7 @@ copy_table(Relation rel)
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
* do SELECT * because we need to not copy generated columns. For
- * tables with any row filters, build a SELECT query with AND'ed row
+ * tables with any row filters, build a SELECT query with OR'ed row
* filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
@@ -908,7 +929,7 @@ copy_table(Relation rel)
}
appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
- /* list of AND'ed filters */
+ /* list of OR'ed filters */
if (qual != NIL)
{
ListCell *lc;
@@ -922,7 +943,7 @@ copy_table(Relation rel)
if (first)
first = false;
else
- appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, " OR ");
appendStringInfoString(&cmd, q);
}
list_free_deep(qual);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 3592468..aa7bdc2 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -797,6 +797,11 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
* NOTE: If the relation is a partition and pubviaroot is true, use
* the row filter of the topmost partitioned table instead of the row
* filter of its own partition.
+ *
+ * NOTE: Multiple row-filters for the same table are combined by OR-ing
+ * them together, but this means that if (in any of the publications)
+ * there is *no* filter then effectively none of the other filters have
+ * any meaning either.
*/
foreach(lc, data->publications)
{
@@ -825,12 +830,28 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
}
ReleaseSysCache(rftuple);
+
+ if (rfisnull)
+ {
+ /*
+ * If there is no row-filter, then any other row-filters for this table
+ * also have no effect (because filters get OR-ed together) so we can
+ * just discard anything found so far and exit early from the publications
+ * loop.
+ */
+ if (rfnodes)
+ {
+ list_free(rfnodes);
+ rfnodes = NIL;
+ }
+ break;
+ }
}
} /* loop all subscribed publications */
/*
- * Combine all the row-filters (if any) into a single filter, and then build the ExprState for it
+ * Combine all the row-filters (if any) into a single filter using OR, and then build the ExprState for it
*/
n_filters = list_length(rfnodes);
if (n_filters > 0)
@@ -838,7 +859,7 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
Node *rfnode;
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = n_filters > 1 ? makeBoolExpr(AND_EXPR, rfnodes, -1) : linitial(rfnodes);
+ rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes, -1) : linitial(rfnodes);
entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
MemoryContextSwitchTo(oldctx);
}
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index e806b5d..abd88ad 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
-use Test::More tests => 7;
+use Test::More tests => 9;
# create publisher node
my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
@@ -23,6 +23,8 @@ $node_publisher->safe_psql('postgres',
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
);
$node_publisher->safe_psql('postgres',
@@ -45,6 +47,8 @@ $node_subscriber->safe_psql('postgres',
$node_subscriber->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
);
$node_subscriber->safe_psql('postgres',
@@ -86,6 +90,13 @@ $node_publisher->safe_psql('postgres',
"CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4a FOR TABLE tab_rowfilter_4 WHERE (c % 2 = 0)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4b FOR TABLE tab_rowfilter_4"
+);
+
#
# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
# SQL commands are for testing the initial data copy using logical replication.
@@ -103,6 +114,8 @@ $node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) SELECT generate_series(1, 10)");
# insert data into partitioned table and directly on the partition
$node_publisher->safe_psql('postgres',
@@ -115,7 +128,7 @@ $node_publisher->safe_psql('postgres',
my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
my $appname = 'tap_sub';
$node_subscriber->safe_psql('postgres',
- "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4a, tap_pub_4b"
);
$node_publisher->wait_for_catchup($appname);
@@ -143,14 +156,26 @@ is( $result, qq(1001|test 1001
# Check expected replicated rows for tab_rowfilter_2
# tap_pub_1 filter is: (c % 2 = 0)
# tap_pub_2 filter is: (c % 3 = 0)
-# When there are multiple publications for the same table, all filter
-# expressions should succeed. In this case, rows are replicated if c value is
-# divided by 2 AND 3 (6, 12, 18).
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if the
+# c value is divisible by 2 OR 3 (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
#
$result =
$node_subscriber->safe_psql('postgres',
"SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
-is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+is($result, qq(13|2|20), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but a publication with no filter matches
+# everything - i.e. the same as having no filter at all.
+# Expect all rows: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(10|1|10), 'check initial data copy from table tab_rowfilter_4');
# Check expected replicated rows for tab_rowfilter_3
# There is no filter. 10 rows are inserted, so 10 rows are replicated.
@@ -210,9 +235,28 @@ $node_publisher->safe_psql('postgres',
"UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
$node_publisher->safe_psql('postgres',
"DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) VALUES (0)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) VALUES (11)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) VALUES (12)");
$node_publisher->wait_for_catchup($appname);
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but a publication with no filter matches
+# everything - i.e. the same as having no filter at all.
+# Expect all rows from initial copy: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+# And also (0, 11, 12)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
+
# Check expected replicated rows for tab_rowfilter_1
# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
#
--
1.8.3.1
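For reviewers, the multi-publication semantics the tests above assert (filters OR'ed together; a publication with no filter matches everything) can be modelled outside the server. This is a hypothetical Python sketch; `publish_row` and its shape are invented for illustration, not PostgreSQL code:

```python
def publish_row(filters, row):
    """Decide whether a row is replicated, given the row filters of all
    publications the subscription uses for this table.

    filters: list of predicates (callables); None means that publication
    has no row filter. Semantics matched to the tests above: an entry
    with no filter matches everything, otherwise the filter expressions
    are OR'ed together.
    """
    if not filters:
        return True                       # table published with no filter at all
    if any(f is None for f in filters):
        return True                       # one publication has no filter
    return any(f(row) for f in filters)   # OR of all filter expressions

# tab_rowfilter_2: tap_pub_1 (c % 2 = 0) OR tap_pub_2 (c % 3 = 0)
replicated = [c for c in range(1, 21)
              if publish_row([lambda c: c % 2 == 0, lambda c: c % 3 == 0], c)]
```

With this model, `replicated` reproduces the 13 rows (min 2, max 20) checked for tab_rowfilter_2, and a `None` entry reproduces the "publish everything" behaviour checked for tab_rowfilter_4.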
v40-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patchapplication/octet-stream; name=v40-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patchDownload
From 90d3f795d1bc8a864ae350dcb4e1bd75c41e6e85 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Mon, 15 Nov 2021 15:00:07 +1100
Subject: [PATCH v40] PS - Add tab auto-complete support for the Row Filter
WHERE.
e.g.
"CREATE PUBLICATION <name> FOR TABLE <name>" - complete with "WHERE (".
"ALTER PUBLICATION <name> ADD|SET TABLE <name>" - complete with "WHERE (".
Author: Peter Smith
---
src/bin/psql/tab-complete.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 4f724e4..8c7fe7d 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,14 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (");
+ /* "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with table attributes */
+ /* "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2757,10 +2765,13 @@ psql_completion(const char *text, int start, int end)
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
+ /* "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
/*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
--
1.8.3.1
v40-0003-PS-ExprState-cache-modifications.patchapplication/octet-stream; name=v40-0003-PS-ExprState-cache-modifications.patchDownload
From 8c3fc8240d18f3b78d004d384903fe380aa4b8ac Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 18 Nov 2021 09:31:54 +1100
Subject: [PATCH v40] PS - ExprState cache modifications.
Now the cached row filters (i.e. ExprState *) are invalidated only in the
rel_sync_cache_relation_cb function, so an ALTER PUBLICATION for one table no
longer invalidates the row filters of other tables.
Also, all code related to caching row filters has been removed from the
get_rel_sync_entry function; caching is now done just before the filters are
needed, in the pgoutput_row_filter function.
If there are multiple publication filters for a given table, they are all
combined into a single filter.
Author: Peter Smith, Greg Nancarrow
Changes are based on suggestions from Amit [1] [2] and Houz [3]
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
[3] https://www.postgresql.org/message-id/OS0PR01MB5716090A70A73ADF58C58950948D9%40OS0PR01MB5716.jpnprd01.prod.outlook.com
---
src/backend/replication/pgoutput/pgoutput.c | 214 ++++++++++++++++++----------
1 file changed, 139 insertions(+), 75 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 077ae18..3592468 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -1,4 +1,4 @@
-/*-------------------------------------------------------------------------
+/*------------------------------------------------------------------------
*
* pgoutput.c
* Logical Replication output plugin
@@ -21,6 +21,7 @@
#include "executor/executor.h"
#include "fmgr.h"
#include "nodes/nodeFuncs.h"
+#include "nodes/makefuncs.h"
#include "optimizer/optimizer.h"
#include "parser/parse_coerce.h"
#include "replication/logical.h"
@@ -123,7 +124,16 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
- List *exprstate; /* ExprState for row filter */
+
+ /*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' indicates whether the exprstate has been
+ * assigned yet. We cannot use the exprstate value itself for this purpose
+ * because the current relid might have no filter at all (i.e. exprstate
+ * is NULL).
+ */
+ bool rowfilter_valid;
+ ExprState *exprstate; /* ExprState for row filter(s) */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
@@ -161,7 +171,7 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static EState *create_estate_for_relation(Relation rel);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
/*
@@ -731,20 +741,118 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
* If it returns true, the change is replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
{
EState *estate;
ExprContext *ecxt;
ListCell *lc;
bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ List *rfnodes = NIL;
+ int n_filters;
+
+ /*
+ * If the row filter cache is currently flagged "invalid" then we don't
+ * know yet whether there are any row filters for this relation.
+ *
+ * This code normally executes only once per relation.
+ *
+ * NOTE: The ExprState cache could have been built up-front in
+ * get_rel_sync_entry() instead of the deferred on-the-fly assignment
+ * below. It is done here because there are scenarios where
+ * get_rel_sync_entry() is called but no row will be published. For
+ * example, truncate may not need any row evaluation, so there is no
+ * need to compute it. The work would also be wasted if an error occurred
+ * before the filter was ever evaluated, and future operations that use
+ * get_rel_sync_entry() might not need an ExprState either. So the
+ * decision was to defer this logic to the last moment, when we know it
+ * will be needed.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ MemoryContext oldctx;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains. Release the tuple table slot if it
+ * already exists.
+ */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
+ /*
+ * Check whether there are any row filters for this relation. If so,
+ * prepare the necessary ExprState and cache it in entry->exprstate.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+
+ /*
+ * Look up whether there is a row filter and, if so, remember it in a
+ * list. The filters are combined after this 'publications' loop.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(entry->publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes = lappend(rfnodes, rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+
+ } /* loop all subscribed publications */
+
+ /*
+ * Combine all the row-filters (if any) into a single filter, and then build the ExprState for it
+ */
+ n_filters = list_length(rfnodes);
+ if (n_filters > 0)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes, -1) : linitial(rfnodes);
+ entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ entry->rowfilter_valid = true;
+ }
/* Bail out if there is no row filter */
- if (entry->exprstate == NIL)
+ if (!entry->exprstate)
return true;
elog(DEBUG3, "table \"%s.%s\" has row filter",
- get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
- get_rel_name(relation->rd_id));
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
PushActiveSnapshot(GetTransactionSnapshot());
@@ -757,20 +865,13 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
/*
- * If the subscription has multiple publications and the same table has a
- * different row filter in these publications, all row filters must be
- * matched in order to replicate this change.
+ * NOTE: Multiple publication row-filters have already been combined to a
+ * single exprstate.
*/
- foreach(lc, entry->exprstate)
+ if (entry->exprstate)
{
- ExprState *exprstate = (ExprState *) lfirst(lc);
-
/* Evaluates row filter */
- result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
-
- /* If the tuple does not match one of the row filters, bail out */
- if (!result)
- break;
+ result = pgoutput_row_filter_exec_expr(entry->exprstate, ecxt);
}
/* Cleanup allocated resources */
@@ -840,7 +941,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
break;
/*
@@ -873,7 +974,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -907,7 +1008,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1321,10 +1422,11 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
- entry->exprstate = NIL;
+ entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1344,7 +1446,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
List *schemaPubids = GetSchemaPublications(schemaId);
ListCell *lc;
Oid publish_as_relid = relid;
- TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1358,22 +1459,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
publications_valid = true;
}
- /* Release tuple table slot */
- if (entry->scantuple != NULL)
- {
- ExecDropSingleTupleTableSlot(entry->scantuple);
- entry->scantuple = NULL;
- }
-
- /*
- * Create a tuple table slot for row filter. TupleDesc must live as
- * long as the cache remains.
- */
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- tupdesc = CreateTupleDescCopy(tupdesc);
- entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
- MemoryContextSwitchTo(oldctx);
-
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1383,9 +1468,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
{
Publication *pub = lfirst(lc);
bool publish = false;
- HeapTuple rftuple;
- Datum rfdatum;
- bool rfisnull;
if (pub->alltables)
{
@@ -1449,33 +1531,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- /*
- * Cache row filter, if available. All publication-table mappings
- * must be checked. If it is a partition and pubviaroot is true,
- * use the row filter of the topmost partitioned table instead of
- * the row filter of its own partition.
- */
- rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
- if (HeapTupleIsValid(rftuple))
- {
- rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
-
- if (!rfisnull)
- {
- Node *rfnode;
- ExprState *exprstate;
-
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = stringToNode(TextDatumGetCString(rfdatum));
-
- /* Prepare for expression execution */
- exprstate = pgoutput_row_filter_init_expr(rfnode);
- entry->exprstate = lappend(entry->exprstate, exprstate);
- MemoryContextSwitchTo(oldctx);
- }
-
- ReleaseSysCache(rftuple);
- }
}
list_free(pubids);
@@ -1582,6 +1637,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanups. (Will be rebuilt later if needed).
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstate != NULL)
+ {
+ pfree(entry->exprstate);
+ entry->exprstate = NULL;
+ }
}
}
@@ -1622,12 +1692,6 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
-
- if (entry->exprstate != NIL)
- {
- list_free_deep(entry->exprstate);
- entry->exprstate = NIL;
- }
}
MemoryContextSwitchTo(oldctx);
--
1.8.3.1
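The caching scheme in 0003 (a validity flag plus on-demand rebuild, reset only by the per-relation invalidation callback) can be sketched as a hypothetical Python model. Class and method names here are invented for illustration, not PostgreSQL code:

```python
class RelEntry:
    """Minimal model of the RelationSyncEntry caching in patch 0003: the
    combined filter is built lazily on first use, and only the
    relation-level invalidation callback drops it."""

    def __init__(self):
        self.rowfilter_valid = False   # "do we know the filter yet?"
        self.exprstate = None          # combined filter, or None if no filter

    def row_filter(self, build_filters, row):
        if not self.rowfilter_valid:
            filters = build_filters()  # models the syscache lookups
            # Combine all filters into one; None means "no filter at all".
            self.exprstate = (
                (lambda r: any(f(r) for f in filters)) if filters else None)
            self.rowfilter_valid = True   # valid even when there is no filter
        if self.exprstate is None:
            return True                   # no filter: always replicate
        return self.exprstate(row)

    def invalidate(self):                 # models rel_sync_cache_relation_cb
        self.rowfilter_valid = False
        self.exprstate = None
```

The separate `rowfilter_valid` flag is the key point: a `None` exprstate is a legitimate cached state ("this relation has no filter"), so the flag, not the pointer, records whether the lookup has been done.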
v40-0005-PS-Row-filter-validation-of-replica-identity.patchapplication/octet-stream; name=v40-0005-PS-Row-filter-validation-of-replica-identity.patchDownload
From 5ed8f87c48f9812a8a4aee73f7c54c1a9180d95d Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 18 Nov 2021 10:01:02 +1100
Subject: [PATCH v40] PS - Row filter validation of replica identity.
This patch introduces some additional row filter validation. Currently it is
implemented only for the publish mode "delete": it validates that every column
referenced in the filter expression is part of the REPLICA IDENTITY or the
primary key.
Also updated test code and PG docs.
Author: Peter Smith
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
---
doc/src/sgml/ref/create_publication.sgml | 5 +-
src/backend/catalog/pg_publication.c | 97 ++++++++++++++++++++++++++-
src/test/regress/expected/publication.out | 105 +++++++++++++++++++++++++++---
src/test/regress/sql/publication.sql | 83 ++++++++++++++++++++++-
src/test/subscription/t/025_row_filter.pl | 7 +-
5 files changed, 277 insertions(+), 20 deletions(-)
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 52f6a1c..03cc956 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -231,8 +231,9 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
The <literal>WHERE</literal> clause should contain only columns that are
- part of the primary key or be covered by <literal>REPLICA
- IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
+ covered by <literal>REPLICA IDENTITY</literal>, or are part of the primary
+ key (when <literal>REPLICA IDENTITY</literal> is not set); otherwise,
+ <command>DELETE</command> operations will not
be replicated. That's because old row is used and it only contains primary
key or columns that are part of the <literal>REPLICA IDENTITY</literal>; the
remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 57d08e7..d99aa4e 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,7 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parse_clause.h"
#include "parser/parse_collate.h"
#include "parser/parse_relation.h"
@@ -213,10 +214,99 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
+typedef struct {
+ Relation rel;
+ Bitmapset *bms_replident;
+}
+rf_context;
+
/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
+ * Walk the row-filter expression and check that all the referenced columns
+ * are permitted, else error out.
*/
+static bool
+rowfilter_expr_replident_walker(Node *node, rf_context *context)
+{
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ Oid relid = RelationGetRelid(context->rel);
+ Var *var = (Var *) node;
+ AttrNumber attnum = var->varattno;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, context->bms_replident))
+ {
+ const char *colname = get_attname(relid, attnum, false);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ RelationGetRelationName(context->rel)),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+
+ return true;
+ }
+
+ return expression_tree_walker(node, rowfilter_expr_replident_walker,
+ (void *) context);
+}
+
+/*
+ * Decide if the row-filter is valid according to the following rules:
+ *
+ * Rule 1. If the publish operation contains "delete" then only columns that
+ * are allowed by the REPLICA IDENTITY rules are permitted to be used in the
+ * row-filter WHERE clause.
+ *
+ * Rule 2. TODO
+ */
+static void
+rowfilter_expr_checker(Publication *pub, Relation rel, Node *rfnode)
+{
+ /*
+ * Rule 1: For "delete", check that filter cols are also valid replica
+ * identity cols.
+ *
+ * TODO - check later for publish "update" case.
+ */
+ if (pub->pubactions.pubdelete)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ Bitmapset *bms_okcols;
+ rf_context context = {0};
+
+ /*
+ * Find the columns that are part of the REPLICA IDENTITY. Note that
+ * REPLICA IDENTITY DEFAULT means the primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ bms_okcols = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+
+ context.rel = rel;
+ context.bms_replident = bms_okcols;
+ (void) rowfilter_expr_replident_walker(rfnode, &context);
+
+ bms_free(bms_okcols);
+ }
+ }
+}
+
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -315,6 +405,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, targetrel, whereclause);
}
/* Form a tuple. */
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 9e7f81d..5e7fe01 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -248,13 +248,15 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -264,7 +266,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -275,7 +277,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -286,7 +288,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
@@ -310,26 +312,26 @@ Publications:
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax1
Publication testpub_syntax1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE ((e < 999))
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax2
Publication testpub_syntax2
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
@@ -387,6 +389,91 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 21cc923..b127605 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -143,7 +143,9 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -163,12 +165,12 @@ RESET client_min_messages;
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax1
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax2
DROP PUBLICATION testpub_syntax2;
@@ -209,6 +211,81 @@ DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index abd88ad..2703470 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -267,9 +269,7 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -278,7 +278,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
--
1.8.3.1
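The column-reference tests above all exercise one rule: when a publication publishes deletes (and, in the later patches, updates), every column used in the row filter must be covered by the table's replica identity. As an illustration only (the function and parameter names below are hypothetical, not the patch's actual C implementation), the check can be sketched like this:

```python
def disallowed_filter_cols(filter_cols, replident, pk_cols=frozenset(),
                           index_cols=frozenset(), all_cols=frozenset()):
    """Return the filter columns not covered by the replica identity.

    replident mirrors pg_class.relreplident: 'd' (default, i.e. the
    primary key), 'f' (full), 'i' (using index), 'n' (nothing).
    """
    if replident == 'f':      # REPLICA IDENTITY FULL: every column is logged
        allowed = all_cols
    elif replident == 'd':    # DEFAULT: only primary-key columns
        allowed = pk_cols
    elif replident == 'i':    # USING INDEX: only the index's columns
        allowed = index_cols
    else:                     # NOTHING: no old-tuple columns at all
        allowed = frozenset()
    return set(filter_cols) - set(allowed)
```

An empty result corresponds to the "ok" cases in the tests above; a non-empty result corresponds to the "fail ... is not part of the REPLICA IDENTITY" errors.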
Attachment: v40-0006-PS-Row-filter-validation-walker.patch (application/octet-stream)
From db6a9b9b847ae6c0943d8691ea4a48ddb9962bfb Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 18 Nov 2021 10:27:30 +1100
Subject: [PATCH v40] PS - Row filter validation walker.
This patch implements a parse-tree "walker" to validate a row-filter expression.
Only very simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no user-defined functions.
- no system functions (unless they are IMMUTABLE). See design decision at [1].
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr, NullIfExpr
[1] https://www.postgresql.org/message-id/CAA4eK1%2BXoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ%40mail.gmail.com
This patch also disables (#if 0) all other row-filter errors which were thrown for
EXPR_KIND_PUBLICATION_WHERE in the 0001 patch.
Some regression tests are updated due to modified validation messages and rules.
Author: Peter Smith
---
src/backend/catalog/dependency.c | 94 +++++++++++++++++++++++++++++++
src/backend/catalog/pg_publication.c | 18 ++++--
src/backend/parser/parse_agg.c | 5 +-
src/backend/parser/parse_expr.c | 6 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 2 +
src/include/catalog/dependency.h | 3 +-
src/test/regress/expected/publication.out | 29 +++++++---
src/test/regress/sql/publication.sql | 15 ++++-
9 files changed, 157 insertions(+), 18 deletions(-)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index fe9c714..fd1d0a6 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -133,6 +133,12 @@ typedef struct
int subflags; /* flags to pass down when recursing to obj */
} ObjectAddressAndFlags;
+/* for rowfilter_walker */
+typedef struct
+{
+ char *relname;
+} rf_context;
+
/* for find_expr_references_walker */
typedef struct
{
@@ -1569,6 +1575,94 @@ ReleaseDeletionLock(const ObjectAddress *object)
}
/*
+ * Walker checks that the row filter expression is legal. Allow only simple
+ * or compound expressions like:
+ *
+ * "(Var Op Const)" or
+ * "(Var Op Const) Bool (Var Op Const)"
+ *
+ * User-defined operators are not allowed.
+ * User-defined functions are not allowed.
+ * System functions that are not IMMUTABLE are not allowed.
+ * NULLIF is allowed.
+ */
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+ bool too_complex = false;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var) || IsA(node, Const) || IsA(node, BoolExpr) || IsA(node, NullIfExpr))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ Oid funcid = ((FuncExpr *)node)->funcid;
+ char *funcname = get_func_name(funcid);
+
+ /*
+ * User-defined functions are not allowed.
+ * System functions that are not IMMUTABLE are not allowed.
+ */
+ if (funcid >= FirstNormalObjectId)
+ {
+ forbidden = psprintf("user-defined functions are not allowed: %s",
+ funcname);
+ }
+ else
+ {
+ if (func_volatile(funcid) != PROVOLATILE_IMMUTABLE)
+ forbidden = psprintf("system functions that are not IMMUTABLE are not allowed: %s",
+ funcname);
+ }
+ }
+ else
+ {
+ elog(DEBUG1, "the row filter contained something unexpected: %s", nodeToString(node));
+ too_complex = true;
+ }
+
+ if (too_complex)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ context->relname),
+ errhint("only simple expressions using columns, constants and immutable system functions are allowed")
+ ));
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ context->relname),
+ errdetail("%s", forbidden)
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
+/*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if it encounters anything not permitted or unexpected.
+ */
+void
+rowfilter_validator(char *relname, Node *expr)
+{
+ rf_context context = {0};
+
+ context.relname = relname;
+ rowfilter_walker(expr, &context);
+}
+
+/*
* recordDependencyOnExpr - find expression dependencies
*
* This is used to find the dependencies of rules, constraint expressions,
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index d99aa4e..7174a56 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -258,17 +258,25 @@ rowfilter_expr_replident_walker(Node *node, rf_context *context)
/*
* Decide if the row-filter is valid according to the following rules:
*
- * Rule 1. If the publish operation contains "delete" then only columns that
+ * Rule 1. Walk the parse-tree and reject anything other than very simple
+ * expressions. (See rowfilter_validator for details of what is permitted).
+ *
+ * Rule 2. If the publish operation contains "delete" then only columns that
* are allowed by the REPLICA IDENTITY rules are permitted to be used in the
* row-filter WHERE clause.
- *
- * Rule 2. TODO
*/
static void
rowfilter_expr_checker(Publication *pub, Relation rel, Node *rfnode)
{
+ char *relname = RelationGetRelationName(rel);
+
+ /*
+ * Rule 1. Walk the parse-tree and reject anything unexpected.
+ */
+ rowfilter_validator(relname, rfnode);
+
/*
- * Rule 1: For "delete", check that filter cols are also valid replica
+ * Rule 2: For "delete", check that filter cols are also valid replica
* identity cols.
*
* TODO - check later for publish "update" case.
@@ -401,7 +409,7 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
whereclause = transformWhereClause(pstate,
copyObject(pri->whereClause),
EXPR_KIND_PUBLICATION_WHERE,
- "PUBLICATION");
+ "PUBLICATION WHERE");
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..5e0c391 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,12 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
if (isAgg)
err = _("aggregate functions are not allowed in publication WHERE expressions");
else
err = _("grouping operations are not allowed in publication WHERE expressions");
-
+#endif
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +952,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("window functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..3519e62 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -201,6 +201,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
case T_FuncCall:
{
+#if 0
/*
* Forbid functions in publication WHERE condition
*/
@@ -209,6 +210,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("functions are not allowed in publication WHERE expressions"),
parser_errposition(pstate, exprLocation(expr))));
+#endif
result = transformFuncCall(pstate, (FuncCall *) expr);
break;
@@ -1777,7 +1779,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("cannot use subquery in publication WHERE expression");
+#endif
break;
/*
@@ -3100,7 +3104,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 29bebb7..212f473 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,10 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+ pstate->p_hasTargetSRFs = true;
+#if 0
err = _("set-returning functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..b3588df 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,12 +718,14 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+#if 0
/* Check it's not a custom operator for publication WHERE expressions */
if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("user-defined operators are not allowed in publication WHERE expressions"),
parser_errposition(pstate, location)));
+#endif
/* Do typecasting and build the expression tree */
if (ltree == NULL)
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 3eca295..ab9bfe3 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -16,7 +16,6 @@
#include "catalog/objectaddress.h"
-
/*
* Precise semantics of a dependency relationship are specified by the
* DependencyType code (which is stored in a "char" field in pg_depend,
@@ -152,6 +151,8 @@ extern void performDeletion(const ObjectAddress *object,
extern void performMultipleDeletions(const ObjectAddresses *objects,
DropBehavior behavior, int flags);
+extern void rowfilter_validator(char *relname, Node *expr);
+
extern void recordDependencyOnExpr(const ObjectAddress *depender,
Node *expr, List *rtable,
DependencyType behavior);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 5e7fe01..4c13f93 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -355,19 +355,31 @@ ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: user-defined functions are not allowed: testpub_rf_func99
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: system functions that are not IMMUTABLE are not allowed: random
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+HINT: only simple expressions using columns, constants and immutable system functions are allowed
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
@@ -389,6 +401,7 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
-- ======================================================
-- More row filter tests for validating column references
CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index b127605..4086d61 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -184,13 +184,23 @@ SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
SET client_min_messages = 'ERROR';
@@ -210,6 +220,7 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
-- ======================================================
-- More row filter tests for validating column references
--
1.8.3.1
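The next attachment transforms published UPDATEs according to whether the old and new tuples pass the row filter. The decision matrix from its commit message can be sketched as a small function (illustrative only; the patch implements this in C inside pgoutput):

```python
def transform_update(old_matches: bool, new_matches: bool):
    """Decide how a filtered UPDATE is published.

    Returns 'UPDATE', 'INSERT', 'DELETE', or None (drop the change),
    following the old-row/new-row matrix in the commit message.
    """
    if old_matches and new_matches:
        return 'UPDATE'   # row stays inside the filtered set
    if old_matches:
        return 'DELETE'   # row leaves the filtered set
    if new_matches:
        return 'INSERT'   # row enters the filtered set
    return None           # row was never in the filtered set
```

This is why the 025_row_filter.pl expectations change below: with REPLICA IDENTITY FULL the old tuple is available, so a DELETE of a matching row is now published instead of being silently filtered out.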
Attachment: v40-0007-Support-updates-based-on-old-and-new-tuple-in-ro.patch (application/octet-stream)
From 72e0430375c827ab19fea58c3385b788578c7d31 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 18 Nov 2021 11:36:22 +1100
Subject: [PATCH v40] Support updates based on old and new tuple in row
filters.
When applying row filter on updates, check both old_tuple and new_tuple
to decide how an update needs to be transformed.
UPDATE
old-row (match)    new-row (no match) -> DELETE
old-row (no match) new-row (match)    -> INSERT
old-row (match)    new-row (match)    -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Deformed tuples are also cached in slots, to avoid deforming the same
tuple multiple times.
Author: Ajin Cherian
---
src/backend/catalog/pg_publication.c | 12 +-
src/backend/replication/logical/proto.c | 35 ++--
src/backend/replication/pgoutput/pgoutput.c | 243 ++++++++++++++++++++++++++--
src/include/replication/logicalproto.h | 7 +-
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/025_row_filter.pl | 4 +-
src/tools/pgindent/typedefs.list | 1 +
7 files changed, 267 insertions(+), 41 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 7174a56..09f0981 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -261,9 +261,9 @@ rowfilter_expr_replident_walker(Node *node, rf_context *context)
* Rule 1. Walk the parse-tree and reject anything other than very simple
 * expressions. (See rowfilter_validator for details of what is permitted).
*
- * Rule 2. If the publish operation contains "delete" then only columns that
- * are allowed by the REPLICA IDENTITY rules are permitted to be used in the
- * row-filter WHERE clause.
+ * Rule 2. If the publish operation contains "delete" or "update" then only
+ * columns that are allowed by the REPLICA IDENTITY rules are permitted to
+ * be used in the row-filter WHERE clause.
*/
static void
rowfilter_expr_checker(Publication *pub, Relation rel, Node *rfnode)
@@ -276,12 +276,10 @@ rowfilter_expr_checker(Publication *pub, Relation rel, Node *rfnode)
rowfilter_validator(relname, rfnode);
/*
- * Rule 2: For "delete", check that filter cols are also valid replica
+ * Rule 2: For "delete" or "update", check that filter cols are also valid replica
* identity cols.
- *
- * TODO - check later for publish "update" case.
*/
- if (pub->pubactions.pubdelete)
+ if (pub->pubactions.pubdelete || pub->pubactions.pubupdate)
{
char replica_identity = rel->rd_rel->relreplident;
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..6b55a94 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -31,8 +32,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
- HeapTuple tuple, bool binary);
-
+ HeapTuple tuple, TupleTableSlot *slot,
+ bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -410,7 +411,7 @@ logicalrep_write_insert(StringInfo out, TransactionId xid, Relation rel,
pq_sendint32(out, RelationGetRelid(rel));
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, NULL, binary);
}
/*
@@ -442,7 +443,8 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
*/
void
logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
- HeapTuple oldtuple, HeapTuple newtuple, bool binary)
+ HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary)
{
pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
@@ -463,11 +465,11 @@ logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
pq_sendbyte(out, 'O'); /* old tuple follows */
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, oldslot, binary);
}
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, newslot, binary);
}
/*
@@ -536,7 +538,7 @@ logicalrep_write_delete(StringInfo out, TransactionId xid, Relation rel,
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, NULL, binary);
}
/*
@@ -749,11 +751,12 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
-logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binary)
+logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, TupleTableSlot *slot,
+                       bool binary)
{
TupleDesc desc;
- Datum values[MaxTupleAttributeNumber];
- bool isnull[MaxTupleAttributeNumber];
+ Datum *values;
+ bool *isnull;
int i;
uint16 nliveatts = 0;
@@ -771,7 +774,17 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binar
enlargeStringInfo(out, tuple->t_len +
nliveatts * (1 + 4));
- heap_deform_tuple(tuple, desc, values, isnull);
+ if (slot == NULL || TTS_EMPTY(slot))
+ {
+ values = (Datum *) palloc(desc->natts * sizeof(Datum));
+ isnull = (bool *) palloc(desc->natts * sizeof(bool));
+ heap_deform_tuple(tuple, desc, values, isnull);
+ }
+ else
+ {
+ values = slot->tts_values;
+ isnull = slot->tts_isnull;
+ }
/* Write the values */
for (i = 0; i < desc->natts; i++)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index aa7bdc2..a0d1455 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -134,7 +134,10 @@ typedef struct RelationSyncEntry
*/
bool rowfilter_valid;
ExprState *exprstate; /* ExprState for row filter(s) */
- TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
+ TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
+ TupleTableSlot *tmp_new_tuple; /* slot for temporary new tuple used for expression evaluation */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -169,10 +172,16 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot,
+ RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry,
+ ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -736,17 +745,112 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * Update is checked against the row filter, if any.
+ * Updates are transformed to inserts and deletes based on the
+ * old_tuple and new_tuple. The new action is updated in the
+ * action parameter. If not updated, action remains as update.
+ * old-row (no match) new-row (no match) -> (drop change)
+ * old-row (no match) new-row (match)    -> INSERT
+ * old-row (match)    new-row (no match) -> DELETE
+ * old-row (match)    new-row (match)    -> UPDATE
+ * If the change is to be replicated returns true, else false.
*/
static bool
-pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
+{
+ TupleDesc desc = entry->scantuple->tts_tupleDescriptor;
+ int i;
+ bool old_matched, new_matched;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ /* update requires a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /*
+ * If no old_tuple, then none of the replica identity columns changed
+ * and this would reduce to a simple update.
+ */
+ if (!oldtuple)
+ {
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ return pgoutput_row_filter(relation, NULL, newtuple, entry);
+ }
+
+ old_slot = entry->old_tuple;
+ new_slot = entry->new_tuple;
+ tmp_new_slot = entry->tmp_new_tuple;
+ ExecClearTuple(old_slot);
+ ExecClearTuple(new_slot);
+ ExecClearTuple(tmp_new_slot);
+
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, new_slot);
+
+ /*
+ * For updates, both the newtuple and oldtuple needs to be checked
+ * against the row-filter. The newtuple might not have all the
+ * replica identity columns, in which case it needs to be
+ * copied over from the oldtuple.
+ */
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ /* if the column in the new_tuple is null, nothing to do */
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ /*
+ * Unchanged toasted replica identity columns are
+ * only detoasted in the old tuple, copy this over to the newtuple.
+ */
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ Assert(!old_slot->tts_isnull[i] &&
+ !VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]));
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ }
+
+ }
+
+ old_matched = pgoutput_row_filter_virtual(relation, old_slot, entry);
+ new_matched = pgoutput_row_filter_virtual(relation, tmp_new_slot, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (!old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && old_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter, the first time.
+ */
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
{
- EState *estate;
- ExprContext *ecxt;
ListCell *lc;
- bool result = true;
Oid relid = RelationGetRelid(relation);
List *rfnodes = NIL;
int n_filters;
@@ -774,7 +878,7 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
TupleDesc tupdesc = RelationGetDescr(relation);
/*
- * Create a tuple table slot for row filter. TupleDesc must live as
+ * Create tuple table slots for row filter. TupleDesc must live as
* long as the cache remains. Release the tuple table slot if it
* already exists.
*/
@@ -783,9 +887,28 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
+ if (entry->old_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->old_tuple);
+ entry->old_tuple = NULL;
+ }
+ if (entry->new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->new_tuple);
+ entry->new_tuple = NULL;
+ }
+ if (entry->tmp_new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->tmp_new_tuple);
+ entry->tmp_new_tuple = NULL;
+ }
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+
MemoryContextSwitchTo(oldctx);
/*
@@ -866,6 +989,67 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
entry->rowfilter_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * The row is passed in as a virtual slot.
+ *
+ */
+static bool
+pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ if (message_level_is_interesting(DEBUG3))
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(entry->relid)),
+ get_rel_name(entry->relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = slot;
+
+ /*
+ * NOTE: Multiple publication row-filters have already been combined to a
+ * single exprstate.
+ */
+ if (entry->exprstate)
+ {
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(entry->exprstate, ecxt);
+ }
+
+ /* Cleanup allocated resources */
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
/* Bail out if there is no row filter */
if (!entry->exprstate)
@@ -896,7 +1080,6 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
}
/* Cleanup allocated resources */
- ResetExprContext(ecxt);
FreeExecutorState(estate);
PopActiveSnapshot();
@@ -954,6 +1137,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -962,7 +1148,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
break;
/*
@@ -993,9 +1179,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
- /* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter_update_check(relation, oldtuple, newtuple, relentry,
+ &modified_action))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1018,8 +1205,27 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ logicalrep_write_update(ctx->out, xid, relation,
+ oldtuple, newtuple, relentry->old_tuple,
+ relentry->new_tuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1029,7 +1235,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1447,6 +1653,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
+ entry->new_tuple = NULL;
+ entry->old_tuple = NULL;
+ entry->tmp_new_tuple = NULL;
entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..427c40a 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -211,7 +212,11 @@ extern void logicalrep_write_insert(StringInfo out, TransactionId xid,
extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup);
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
- HeapTuple newtuple, bool binary);
+ HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/025_row_filter.pl b/src/test/subscription/t/025_row_filter.pl
index 2703470..96bfe2b 100644
--- a/src/test/subscription/t/025_row_filter.pl
+++ b/src/test/subscription/t/025_row_filter.pl
@@ -264,7 +264,8 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+# but row deleted after the update below.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -276,7 +277,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index da6ac8e..2f41eac 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2194,6 +2194,7 @@ ReorderBufferApplyChangeCB
ReorderBufferApplyTruncateCB
ReorderBufferBeginCB
ReorderBufferChange
+ReorderBufferChangeType
ReorderBufferCommitCB
ReorderBufferCommitPreparedCB
ReorderBufferDiskChange
--
1.8.3.1
On Thu, Sep 23, 2021 at 10:33 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
3) create_subscription.sgml
    <literal>WHERE</literal> clauses, rows must satisfy all expressions
    to be copied. If the subscriber is a

I'm rather skeptical about the principle that all expressions have to
match - I'd have expected exactly the opposite behavior, actually.

I see a subscription as "a union of all publications". Imagine for
example you have a data set for all customers, and you create a
publication for different parts of the world, like

CREATE PUBLICATION customers_france
    FOR TABLE customers WHERE (country = 'France');

CREATE PUBLICATION customers_germany
    FOR TABLE customers WHERE (country = 'Germany');

CREATE PUBLICATION customers_usa
    FOR TABLE customers WHERE (country = 'USA');

and now you want to subscribe to multiple publications, because you want
to replicate data for multiple countries (e.g. you want EU countries).
But if you do

CREATE SUBSCRIPTION customers_eu
    PUBLICATION customers_france, customers_germany;

then you won't get anything, because each customer belongs to just a
single country. Yes, I could create multiple individual subscriptions,
one for each country, but that's inefficient and may have a different
set of issues (e.g. keeping them in sync when a customer moves between
countries).

I might have missed something, but I haven't found any explanation why
the requirement to satisfy all expressions is the right choice.

IMHO this should be 'satisfies at least one expression' i.e. we should
connect the expressions by OR, not AND.
Fixed in V40 [1]
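To make the difference concrete, here is a small Python sketch of the two combination semantics (an illustration only, not the PostgreSQL implementation; the publication names, filter predicates, and sample rows are invented to mirror the countries example above):

```python
# A toy model of one subscription following two publications, each with
# its own row filter.
filters = [
    lambda row: row["country"] == "France",   # customers_france
    lambda row: row["country"] == "Germany",  # customers_germany
]

customers = [
    {"id": 1, "country": "France"},
    {"id": 2, "country": "Germany"},
    {"id": 3, "country": "USA"},
]

# AND semantics: a row must satisfy every publication's filter.
# No customer is in two countries at once, so nothing is replicated.
and_rows = [r for r in customers if all(f(r) for f in filters)]

# OR semantics: a row is replicated if any publication's filter matches,
# i.e. the subscription behaves like a union of the publications.
or_rows = [r for r in customers if any(f(r) for f in filters)]

print(len(and_rows))                     # 0
print(sorted(r["id"] for r in or_rows))  # [1, 2]
```

With AND the subscription is empty; with OR it replicates exactly the France and Germany rows, which is the "union of all publications" behavior argued for above.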
-----
[1]: /messages/by-id/CAHut+Pv-D4rQseRO_OzfEz2dQsTKEnKjBCET9Z-iJppyT1XNMQ@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Mon, Nov 15, 2021 at 9:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Nov 10, 2021 at 12:36 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Mon, Nov 8, 2021 at 5:53 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

3) v37-0005
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr
I think there could be other node types which can also be considered as simple
expressions, for example T_NullIfExpr.

The current walker restrictions are from a previously agreed decision
by Amit/Tomas [1] and from an earlier suggestion from Andres [2] to
keep everything very simple for a first version.Yes, you are right, there might be some additional node types that
might be fine, but at this time I don't want to add anything different
without getting their approval to do so. Anyway, additions like this
are all candidates for a future version of this row-filter feature.

I think we can consider T_NullIfExpr unless you see any problem with the same.
Added in v40 [1]
Few comments on the latest set of patches (v39*)
=======================================
...
0003*
3. In pgoutput_row_filter(), the patch is finding pub_relid when it
should already be there in RelationSyncEntry->publish_as_relid found
during get_rel_sync_entry call. Is there a reason to do this work
again?
Fixed in v40 [1]
4. I think we should add some comments in pgoutput_row_filter() as to
why we are caching the row_filter here instead of
get_rel_sync_entry()? That has been discussed multiple times so it is
better to capture that in comments.
Added comment in v40 [1]
5. Why do you need a separate variable rowfilter_valid to indicate
whether a valid row filter exists? Why exprstate is not sufficient?
Can you update comments to indicate why we need this variable
separately?
I have improved the (existing) comment in v40 [1].
0004*
6. In rowfilter_expr_checker(), the expression tree is traversed
twice, can't we traverse it once to detect all non-allowed stuff? It
can be sometimes costly to traverse the tree multiple times especially
when the expression is complex and it doesn't seem acceptable to do so
unless there is some genuine reason for the same.
I kind of doubt there would be any perceptible difference for 2
traverses instead of 1 because:
a) filters are limited to simple expressions. Yes, a large boolean
expression is possible but I don't think it is likely.
b) the validation part is mostly a one-time execution only when the
filter is created or changed.
Anyway, I am happy to try to refactor the logic to a single traversal
as suggested, but I'd like to combine those "validation" patches
(v40-0005, v40-0006) first, so I can combine their walker logic. Is it
OK?
7.
+static void
+rowfilter_expr_checker(Publication *pub, Node *rfnode, Relation rel)

Keep the rel argument before whereclause as that makes the function
signature better.

Fixed in v40 [1]
-----
[1]: /messages/by-id/CAHut+Pv-D4rQseRO_OzfEz2dQsTKEnKjBCET9Z-iJppyT1XNMQ@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Mon, Nov 15, 2021 at 5:09 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Friday, November 12, 2021 6:20 PM Ajin Cherian <itsajin@gmail.com> wrote:
Attaching version 39-
Thanks for the new patch.
I met a problem when using "ALTER PUBLICATION ... SET TABLE ... WHERE ...", the
publisher crashed after executing this statement.

Here is some information about this problem.
Steps to reproduce:
-- publisher
create table t(a int primary key, b int);
create publication pub for table t where (a>5);

-- subscriber
create table t(a int primary key, b int);
create subscription sub connection 'dbname=postgres port=5432' publication pub;

-- publisher
insert into t values (1, 2);
alter publication pub set table t where (a>7);

Publisher log:
2021-11-15 13:36:54.997 CST [3319891] LOG: logical decoding found consistent point at 0/15208B8
2021-11-15 13:36:54.997 CST [3319891] DETAIL: There are no running transactions.
2021-11-15 13:36:54.997 CST [3319891] STATEMENT: START_REPLICATION SLOT "sub" LOGICAL 0/0 (proto_version '3', publication_names '"pub"')
double free or corruption (out)
2021-11-15 13:36:55.072 CST [3319746] LOG: received fast shutdown request
2021-11-15 13:36:55.073 CST [3319746] LOG: aborting any active transactions
2021-11-15 13:36:55.105 CST [3319746] LOG: background worker "logical replication launcher" (PID 3319874) exited with exit code 1
2021-11-15 13:36:55.105 CST [3319869] LOG: shutting down
2021-11-15 13:36:55.554 CST [3319746] LOG: server process (PID 3319891) was terminated by signal 6: Aborted
2021-11-15 13:36:55.554 CST [3319746] DETAIL: Failed process was running: START_REPLICATION SLOT "sub" LOGICAL 0/0 (proto_version '3', publication_names '"pub"')
2021-11-15 13:36:55.554 CST [3319746] LOG: terminating any other active server processes

Backtrace is attached. I think maybe the problem is related to the below change in the 0003 patch:
+ free(entry->exprstate);
Fixed in V40 [1] using a fix provided by Greg Nancarrow.
-----
[1]: /messages/by-id/CAHut+Pv-D4rQseRO_OzfEz2dQsTKEnKjBCET9Z-iJppyT1XNMQ@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Nov 16, 2021 at 7:33 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Friday, November 12, 2021 6:20 PM Ajin Cherian <itsajin@gmail.com> wrote:
Attaching version 39-
I met another problem when filtering out with the operator '~'.
Data can't be replicated as expected.

For example:
-- publisher
create table t (a text primary key);
create publication pub for table t where (a ~ 'aaa');

-- subscriber
create table t (a text primary key);
create subscription sub connection 'port=5432' publication pub;

-- publisher
insert into t values ('aaaaab');
insert into t values ('aaaaabc');
postgres=# select * from t where (a ~ 'aaa');
a
---------
aaaaab
aaaaabc
(2 rows)

-- subscriber
postgres=# select * from t;
a
--------
aaaaab
(1 row)

The second record can't be replicated.
By the way, when only the 0001 patch was applied, I couldn't reproduce this bug.
So, I think it was related to the later patches.
Fixed in V40-0003 [1] using a fix provided by Greg Nancarrow.
-----
[1]: /messages/by-id/CAHut+Pv-D4rQseRO_OzfEz2dQsTKEnKjBCET9Z-iJppyT1XNMQ@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Mon, Nov 15, 2021 at 12:01 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Fri, Nov 12, 2021 at 9:19 PM Ajin Cherian <itsajin@gmail.com> wrote:
Attaching version 39-
Here are some review comments for v39-0006:
1)

@@ -261,9 +261,9 @@ rowfilter_expr_replident_walker(Node *node, rf_context *context)
  * Rule 1. Walk the parse-tree and reject anything other than very simple
  * expressions (See rowfilter_validator for details on what is permitted).
  *
- * Rule 2. If the publish operation contains "delete" then only columns that
- * are allowed by the REPLICA IDENTITY rules are permitted to be used in the
- * row-filter WHERE clause.
+ * Rule 2. If the publish operation contains "delete" or "delete" then only
+ * columns that are allowed by the REPLICA IDENTITY rules are permitted to
+ * be used in the row-filter WHERE clause.
  */
 static void
 rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node *rfnode, Relation rel)
@@ -276,12 +276,10 @@ rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node *rfnode, Relat
 	rowfilter_validator(relname, rfnode);

 	/*
-	 * Rule 2: For "delete", check that filter cols are also valid replica
+	 * Rule 2: For "delete" and "update", check that filter cols are also valid replica
 	 * identity cols.
-	 *
-	 * TODO - check later for publish "update" case.
 	 */
-	if (pub->pubactions.pubdelete)

1a)
Typo - the function comment: "delete" or "delete"; should say:
"delete" or "update"

1b)
I felt it would be better (for the comment in the function body) to
write it as "or" instead of "and" because then it matches with the
code "if ||" that follows this comment.

====
2)
@@ -746,6 +780,92 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
 }

+/*
+ * Write a tuple to the outputstream using cached slot, in the most efficient format possible.
+ */
+static void
+logicalrep_write_tuple_cached(StringInfo out, Relation rel, TupleTableSlot *slot, bool binary)

The function logicalrep_write_tuple_cached seems to have almost all of
its function body in common with logicalrep_write_tuple. Is there any
good way to combine these functions to avoid ~80 lines mostly
duplicated code?

====
3)

+	if (!old_matched && !new_matched)
+		return false;
+
+	if (old_matched && new_matched)
+		*action = REORDER_BUFFER_CHANGE_UPDATE;
+	else if (old_matched && !new_matched)
+		*action = REORDER_BUFFER_CHANGE_DELETE;
+	else if (new_matched && !old_matched)
+		*action = REORDER_BUFFER_CHANGE_INSERT;
+
+	return true;

I felt it is slightly confusing to have inconsistent ordering of the
old_matched and new_matched in those above conditions.

I suggest to use the order like:
* old-row (no match) new-row (no match)
* old-row (no match) new row (match)
* old-row (match) new-row (no match)
* old-row (match) new row (match)

And then be sure to keep consistent ordering in all places it is mentioned:
* in the code
* in the function header comment
* in the commit comment
* in docs?

====
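For reference, the old-row/new-row transformation discussed in item 3 reduces to a four-case decision table. A minimal Python sketch of that logic, in the suggested consistent order (an illustration only, not the patch's C code; the function name is invented):

```python
def transform_update(old_matched, new_matched):
    """Map the row-filter results for an UPDATE's old and new rows to
    the change that should actually be sent downstream.
    Returns None when the change is dropped entirely."""
    if not old_matched and not new_matched:
        return None       # old-row (no match), new-row (no match) -> drop
    if not old_matched and new_matched:
        return "INSERT"   # old-row (no match), new-row (match)
    if old_matched and not new_matched:
        return "DELETE"   # old-row (match), new-row (no match)
    return "UPDATE"       # old-row (match), new-row (match)

# The four cases, in the suggested consistent order:
assert transform_update(False, False) is None
assert transform_update(False, True) == "INSERT"
assert transform_update(True, False) == "DELETE"
assert transform_update(True, True) == "UPDATE"
```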
4)

+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot, RelationSyncEntry *entry)
+{
+	EState	   *estate;
+	ExprContext *ecxt;
+	bool		result = true;
+	Oid			relid = RelationGetRelid(relation);
+
+	/* Bail out if there is no row filter */
+	if (!entry->exprstate)
+		return true;
+
+	elog(DEBUG3, "table \"%s.%s\" has row filter",
+		 get_namespace_name(get_rel_namespace(relid)),
+		 get_rel_name(relid));

It seems like that elog may consume unnecessary CPU most of the time.
I think it might be better to remove the relid declaration and rewrite
that elog as:if (message_level_is_interesting(DEBUG3))
elog(DEBUG3, "table \"%s.%s\" has row filter",
get_namespace_name(get_rel_namespace(entry->relid)),
get_rel_name(entry->relid));

====
5)

diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
  * respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
  * logical decoding don't have to care about these.
  */
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
 {
 	REORDER_BUFFER_CHANGE_INSERT,
 	REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
 	REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
 	REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
 	REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;

This new typedef can be added to src/tools/pgindent/typedefs.list.
All of the above are fixed by Ajin Cherian in V40-0006 [1].
-----
[1]: /messages/by-id/CAHut+Pv-D4rQseRO_OzfEz2dQsTKEnKjBCET9Z-iJppyT1XNMQ@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Mon, Nov 15, 2021 at 8:14 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Fri, Nov 12, 2021 at 3:49 PM Ajin Cherian <itsajin@gmail.com> wrote:
Attaching version 39-
Some comments on 0006
--
...
--
Looking further, I realized that "logicalrep_write_tuple" and
"logicalrep_write_tuple_cached" are completely duplicate except first
one is calling "heap_deform_tuple" and then using local values[] array
and the second one is directly using the slot->values[] array, so in
fact we can pass this also as a parameter or we can put just one if
check the populate the values[] and null array, so if it is cached we
will point directly to the slot->values[] otherwise
heap_deform_tuple(), I think this should be just one simple check.
Fixed in v40 [1]
--

+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot, RelationSyncEntry *entry)

IMHO, the comments should explain how it is different from the
pgoutput_row_filter function. Also comments are saying "If it returns
true, the change is replicated, otherwise, it is not" which is not
exactly true for this function, I mean based on that the caller will
change the action. So I think it is enough to say what this function
is doing but not required to say what the caller will do based on what
this function returns.
Fixed in v40 [1].
--
+	for (i = 0; i < desc->natts; i++)
+	{
+		Form_pg_attribute att = TupleDescAttr(desc, i);
+
+		/* if the column in the new_tuple is null, nothing to do */
+		if (tmp_new_slot->tts_isnull[i])
+			continue;

Put some comments over this loop about what it is trying to do, and
overall I think there are not sufficient comments in the
pgoutput_row_filter_update_check function.
Fixed in v40 [1].
--

+	/*
+	 * Unchanged toasted replica identity columns are
+	 * only detoasted in the old tuple, copy this over to the newtuple.
+	 */
+	if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+		(!old_slot->tts_isnull[i] &&
+		 !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))

Is it ever possible that if the attribute is not NULL in the old slot
still it is stored as VARATT_IS_EXTERNAL_ONDISK? I think no, so
instead of adding
this last condition in check it should be asserted inside the if check.
Fixed in v40 [1]
-----
[1]: /messages/by-id/CAHut+Pv-D4rQseRO_OzfEz2dQsTKEnKjBCET9Z-iJppyT1XNMQ@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Thu, Nov 18, 2021 at 12:33 PM Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v40* patches.
Thanks for the patch updates.
A couple of comments so far:
(1) compilation warning
WIth the patches applied, there's a single compilation warning when
Postgres is built:
pgoutput.c: In function ‘pgoutput_row_filter_init’:
pgoutput.c:854:8: warning: unused variable ‘relid’ [-Wunused-variable]
Oid relid = RelationGetRelid(relation);
^~~~~
v40-0004 = combine using OR instead of AND
- this is a new patch
- new behavior. multiple filters now combine by OR instead of AND
[Tomas 23/9] #3
(2) missing test case
It seems that the current tests are not testing the
multiple-row-filter case (n_filters > 1) in the following code in
pgoutput_row_filter_init():
rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes, -1) :
linitial(rfnodes);
I think a test needs to be added similar to the customers+countries
example that Tomas gave (where there is a single subscription to
multiple publications of the same table, each of which has a
row-filter).
Regards,
Greg Nancarrow
Fujitsu Australia
On Thu, Nov 18, 2021 at 11:02 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Mon, Nov 15, 2021 at 9:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
5. Why do you need a separate variable rowfilter_valid to indicate
whether a valid row filter exists? Why exprstate is not sufficient?
Can you update comments to indicate why we need this variable
separately?

I have improved the (existing) comment in v40 [1].
0004*
6. In rowfilter_expr_checker(), the expression tree is traversed
twice, can't we traverse it once to detect all non-allowed stuff? It
can be sometimes costly to traverse the tree multiple times especially
when the expression is complex and it doesn't seem acceptable to do so
unless there is some genuine reason for the same.

I kind of doubt there would be any perceptible difference for 2
traverses instead of 1 because:
a) filters are limited to simple expressions. Yes, a large boolean
expression is possible but I don't think it is likely.
But in such cases, it will be quite costly and more importantly, I
don't see any good reason why we need to traverse it twice.
b) the validation part is mostly a one-time execution only when the
filter is created or changed.

Anyway, I am happy to try to refactor the logic to a single traversal
as suggested, but I'd like to combine those "validation" patches
(v40-0005, v40-0006) first, so I can combine their walker logic. Is it
OK?
That should be okay. You can combine the logic of v40-0005 and
v40-0006, and then change it so that you need to traverse the
expression once.
--
With Regards,
Amit Kapila.
On Thu, Nov 18, 2021 at 4:32 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Mon, Nov 15, 2021 at 9:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Nov 10, 2021 at 12:36 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Mon, Nov 8, 2021 at 5:53 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

3) v37-0005
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr
I think there could be other node types which can also be considered as simple
expressions, for example T_NullIfExpr.

The current walker restrictions are from a previously agreed decision
by Amit/Tomas [1] and from an earlier suggestion from Andres [2] to
keep everything very simple for a first version.Yes, you are right, there might be some additional node types that
might be fine, but at this time I don't want to add anything different
without getting their approval to do so. Anyway, additions like this
are all candidates for a future version of this row-filter feature.I think we can consider T_NullIfExpr unless you see any problem with the same.
Added in v40 [1]
I've noticed that row-filters that are testing NULL cannot pass the
current expression validation restrictions.
e.g.1
test_pub=# create publication ptest for table t1 where (a is null);
ERROR: invalid publication WHERE expression for relation "t1"
HINT: only simple expressions using columns, constants and immutable
system functions are allowed
e.g.2
test_pub=# create publication ptest for table t1 where (a is not null);
ERROR: invalid publication WHERE expression for relation "t1"
HINT: only simple expressions using columns, constants and immutable
system functions are allowed
So I think it would be useful to permit the NullTest also. Is it OK?
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Mon, Nov 15, 2021 at 9:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
...
Few comments on the latest set of patches (v39*)
=======================================
0001*
1.
 ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
                          bool if_not_exists)
 {
     Relation rel;
     HeapTuple tup;
     Datum values[Natts_pg_publication_rel];
     bool nulls[Natts_pg_publication_rel];
-    Oid relid = RelationGetRelid(targetrel->relation);
+    Relation targetrel = pri->relation;

I don't think such a renaming (targetrel-->pri) is warranted for this
patch. If we really want something like this, we can do it as a
separate patch.
The name "targetrel" implies it is a Relation. (and historically, this
arg once was "Relation *targetrel").
Then when the PublicationRelInfo struct was introduced the arg name
was not changed and it became "PublicationRelInfo *targetrel". But at
that time PublicationRelInfo was just a simple wrapper for a Relation
so that was probably ok.
But now this Row-Filter patch has added more new members to
PublicationRelInfo, so IMO the name change is helpful otherwise it
seems misleading to continue calling it like it was still just a
Relation.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Fri, Nov 19, 2021 at 3:16 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Thu, Nov 18, 2021 at 4:32 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Mon, Nov 15, 2021 at 9:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Nov 10, 2021 at 12:36 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Mon, Nov 8, 2021 at 5:53 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

3) v37-0005
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr
I think there could be other node types which can also be considered as simple
expressions, for example T_NullIfExpr.

The current walker restrictions are from a previously agreed decision
by Amit/Tomas [1] and from an earlier suggestion from Andres [2] to
keep everything very simple for a first version. Yes, you are right,
there might be some additional node types that might be fine, but at
this time I don't want to add anything different without getting their
approval to do so. Anyway, additions like this are all candidates for
a future version of this row-filter feature.

I think we can consider T_NullIfExpr unless you see any problem with the same.
Added in v40 [1]
I've noticed that row-filters that are testing NULL cannot pass the
current expression validation restrictions.

e.g.1
test_pub=# create publication ptest for table t1 where (a is null);
ERROR: invalid publication WHERE expression for relation "t1"
HINT: only simple expressions using columns, constants and immutable
system functions are allowed

e.g.2
test_pub=# create publication ptest for table t1 where (a is not null);
ERROR: invalid publication WHERE expression for relation "t1"
HINT: only simple expressions using columns, constants and immutable
system functions are allowed

So I think it would be useful to permit the NullTest also. Is it OK?
Yeah, I think such simple expressions should be okay but we need to
test left-side expressions for simplicity.
--
With Regards,
Amit Kapila.
On Fri, Nov 19, 2021 at 5:35 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Mon, Nov 15, 2021 at 9:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
...
Few comments on the latest set of patches (v39*)
=======================================
0001*
1.
 ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
                          bool if_not_exists)
 {
     Relation rel;
     HeapTuple tup;
     Datum values[Natts_pg_publication_rel];
     bool nulls[Natts_pg_publication_rel];
-    Oid relid = RelationGetRelid(targetrel->relation);
+    Relation targetrel = pri->relation;

I don't think such a renaming (targetrel-->pri) is warranted for this
patch. If we really want something like this, we can do it as a
separate patch.

The name "targetrel" implies it is a Relation. (and historically, this
arg once was "Relation *targetrel").

Then when the PublicationRelInfo struct was introduced the arg name
was not changed and it became "PublicationRelInfo *targetrel". But at
that time PublicationRelInfo was just a simple wrapper for a Relation
so that was probably ok.

But now this Row-Filter patch has added more new members to
PublicationRelInfo, so IMO the name change is helpful otherwise it
seems misleading to continue calling it like it was still just a
Relation.
Okay, that sounds reasonable.
--
With Regards,
Amit Kapila.
On Thu, Nov 18, 2021 at 12:33 PM Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v40* patches.
I notice that in the 0001 patch, it adds a "relid" member to the
PublicationRelInfo struct:
src/include/catalog/pg_publication.h
typedef struct PublicationRelInfo
{
+ Oid relid;
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
It appears that this new member is not actually required, as the relid
can be simply obtained from the existing "relation" member - using the
RelationGetRelid() macro.
Regards,
Greg Nancarrow
Fujitsu Australia
On Fri, Nov 19, 2021 at 4:15 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Thu, Nov 18, 2021 at 12:33 PM Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v40* patches.
I notice that in the 0001 patch, it adds a "relid" member to the
PublicationRelInfo struct:

src/include/catalog/pg_publication.h
typedef struct PublicationRelInfo
{
+ Oid relid;
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;

It appears that this new member is not actually required, as the relid
can be simply obtained from the existing "relation" member - using the
RelationGetRelid() macro.
+1
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Thu, Nov 18, 2021 at 12:33 PM Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v40* patches.
Another thing I noticed was in the 0004 patch, list_free_deep() should
be used instead of list_free() in the following code block, otherwise
the rfnodes themselves (allocated by stringToNode()) are not freed:
src/backend/replication/pgoutput/pgoutput.c
+ if (rfnodes)
+ {
+ list_free(rfnodes);
+ rfnodes = NIL;
+ }
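The leak Greg describes can be illustrated with a toy list type (the `toy_*` names below are invented for this example; the real API is list_free()/list_free_deep() in pg_list.h): freeing only the cells leaves the payloads that each cell points at — here, the rfnodes built by stringToNode() — still allocated.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for PostgreSQL's List of palloc'd nodes. */
typedef struct Cell
{
    void        *payload;       /* e.g. a node from stringToNode() */
    struct Cell *next;
} Cell;

static int cells_freed;
static int payloads_freed;

/* Analogous to list_free(): releases the cells only. */
static void
toy_list_free(Cell *head)
{
    while (head)
    {
        Cell *next = head->next;
        free(head);
        cells_freed++;
        head = next;
    }
}

/* Analogous to list_free_deep(): releases the payloads as well. */
static void
toy_list_free_deep(Cell *head)
{
    while (head)
    {
        Cell *next = head->next;
        free(head->payload);
        payloads_freed++;
        free(head);
        cells_freed++;
        head = next;
    }
}

static Cell *
toy_cons(void *payload, Cell *tail)
{
    Cell *c = malloc(sizeof(Cell));
    c->payload = payload;
    c->next = tail;
    return c;
}
```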
Regards,
Greg Nancarrow
Fujitsu Australia
On Mon, Nov 22, 2021 at 7:14 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Thu, Nov 18, 2021 at 12:33 PM Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v40* patches.
I have a few more comments on 0007,
@@ -783,9 +887,28 @@ pgoutput_row_filter(PGOutputData *data, Relation
relation, HeapTuple oldtuple, H
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
+ if (entry->old_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->old_tuple);
+ entry->old_tuple = NULL;
+ }
+ if (entry->new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->new_tuple);
+ entry->new_tuple = NULL;
+ }
+ if (entry->tmp_new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->tmp_new_tuple);
+ entry->tmp_new_tuple = NULL;
+ }
in pgoutput_row_filter, we are dropping the slots if there are some
old slots in the RelationSyncEntry. But then I noticed that in
rel_sync_cache_relation_cb(), also we are doing that but only for the
scantuple slot. So IMHO, rel_sync_cache_relation_cb(), is only place
setting entry->rowfilter_valid to false; so why not drop all the slot
that time only and in pgoutput_row_filter(), you can just put an
assert?
2.
+static bool
+pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot,
RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
pgoutput_row_filter_virtual and pgoutput_row_filter are exactly same
except, ExecStoreHeapTuple(), so why not just put one check based on
whether a slot is passed or not, instead of making complete duplicate
copy of the function.
3.
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
Why do we need to copy the tupledesc? do we think that we need to have
this slot even if we close the relation, if so can you add the
comments explaining why we are making a copy here.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Thursday, November 18, 2021 9:34 AM Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v40* patches.
I found a problem on v40. The check for Replica Identity in WHERE clause is not working properly.
For example:
postgres=# create table tbl(a int primary key, b int);
CREATE TABLE
postgres=# create publication pub1 for table tbl where (a>10 and b>10);
CREATE PUBLICATION
I think it should report an error because column b is not part of Replica Identity.
This seems due to "return true" in rowfilter_expr_replident_walker function,
maybe we should remove it.
Besides, a small comment on 0004 patch:
+ * Multiple row-filter expressions for the same publication will later be
+ * combined by the COPY using OR, but this means if any of the filters is
Should we change it to:
Multiple row-filter expressions for the same table ...
Regards,
Tang
On Tue, Nov 23, 2021 at 4:40 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Thursday, November 18, 2021 9:34 AM Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v40* patches.
I found a problem on v40. The check for Replica Identity in WHERE clause is not working properly.
For example:
postgres=# create table tbl(a int primary key, b int);
CREATE TABLE
postgres=# create publication pub1 for table tbl where (a>10 and b>10);
CREATE PUBLICATION

I think it should report an error because column b is not part of Replica Identity.
This seems due to "return true" in rowfilter_expr_replident_walker function,
maybe we should remove it.
This has already been fixed in v41* updates. Please retest when v41* is posted.
Besides, a small comment on 0004 patch:
+ * Multiple row-filter expressions for the same publication will later be
+ * combined by the COPY using OR, but this means if any of the filters is

Should we change it to:
Multiple row-filter expressions for the same table ...
Yes, thanks for reporting. (added to my TODO list)
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Thu, Nov 18, 2021 at 7:04 AM Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v40* patches.
Few comments:
1) When a table is added to the publication, replica identity is
checked. But while modifying the publish action to include
delete/update, replica identity is not checked for the existing
tables. I felt it should be checked for the existing tables too.
@@ -315,6 +405,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, targetrel, whereclause);
postgres=# create publication pub1 for table t1 where ( c1 = 10);
ERROR: cannot add relation "t1" to publication
DETAIL: Row filter column "c1" is not part of the REPLICA IDENTITY
postgres=# create publication pub1 for table t1 where ( c1 = 10) with
(PUBLISH = INSERT);
CREATE PUBLICATION
postgres=# alter publication pub1 set (PUBLISH=DELETE);
ALTER PUBLICATION
2) Since the error message is because it publishes delete/update
operations, it should include publish delete/update in the error
message. Can we change the error message:
+ if (!bms_is_member(attnum -
FirstLowInvalidHeapAttributeNumber, context->bms_replident))
+ {
+ const char *colname = get_attname(relid, attnum, false);
+
+ ereport(ERROR,
+
(errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation
\"%s\" to publication",
+
RelationGetRelationName(context->rel)),
+ errdetail("Row filter column
\"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
To something like:
ereport(ERROR,
(errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
errmsg("cannot add relation \"%s\" to publication because row filter
column \"%s\" does not have a replica identity and publishes
deletes/updates",
RelationGetRelationName(context->rel), colname),
errhint("To enable deleting/updating from the table, set REPLICA
IDENTITY using ALTER TABLE")));
Regards,
Vignesh
On Tues, Nov 23, 2021 2:27 PM vignesh C <vignesh21@gmail.com> wrote:
On Thu, Nov 18, 2021 at 7:04 AM Peter Smith <smithpb2250@gmail.com>
wrote:

PSA new set of v40* patches.
Few comments:
1) When a table is added to the publication, replica identity is checked. But
while modifying the publish action to include delete/update, replica identity is
not checked for the existing tables. I felt it should be checked for the existing
tables too.
In addition to this, I think we might also need some check to prevent users from
changing the REPLICA IDENTITY index which is used in the filter expression.
I was thinking whether it is possible to do the check related to REPLICA
IDENTITY in function CheckCmdReplicaIdentity() or in
GetRelationPublicationActions(). If we
move the REPLICA IDENTITY check to this function, it would be consistent with
the existing behavior about the check related to REPLICA IDENTITY(see the
comments in CheckCmdReplicaIdentity) and seems can cover all the cases
mentioned above.
Another comment about v40-0001 patch:
+ char *relname = pstrdup(RelationGetRelationName(rel));
+
table_close(rel, ShareUpdateExclusiveLock);
+
+ /* Disallow duplicate tables if there are any with row-filters. */
+ if (t->whereClause || list_member_oid(relids_with_rf, myrelid))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("conflicting or redundant row-filters for \"%s\"",
+ relname)));
+ pfree(relname);
Maybe we can do the error check before table_close(), so that we don't need to
invoke pstrdup() and pfree().
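Hou's reordering reflects a general pattern: while the relation is still open its name can be used directly, but after table_close() the name storage may no longer be valid, which is what forced the pstrdup()/pfree() pair. A toy sketch of the check-before-close ordering (all names here are hypothetical, and unlike this stand-in, the real ereport(ERROR) would not return):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy relation whose name storage is released when it is closed. */
typedef struct ToyRel
{
    char *name;
} ToyRel;

static int  errors_reported;
static char last_error_rel[64];

static ToyRel *
toy_open(const char *name)
{
    ToyRel *rel = malloc(sizeof(ToyRel));

    rel->name = malloc(strlen(name) + 1);
    strcpy(rel->name, name);
    return rel;
}

static void
toy_close(ToyRel *rel)
{
    free(rel->name);            /* name is only valid while the rel is open */
    free(rel);
}

/* Stand-in for ereport(); just records what was reported. */
static void
toy_ereport(const char *relname)
{
    errors_reported++;
    strncpy(last_error_rel, relname, sizeof(last_error_rel) - 1);
}

/*
 * Check-before-close ordering: the name is still valid when the error
 * is raised, so no pstrdup()/pfree() pair is needed.
 */
static void
add_table(const char *name, int duplicate)
{
    ToyRel *rel = toy_open(name);

    if (duplicate)
        toy_ereport(rel->name);
    toy_close(rel);
}
```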
Best regards,
Hou zj
On Thu, Nov 18, 2021 at 11:02 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Mon, Nov 15, 2021 at 9:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
4. I think we should add some comments in pgoutput_row_filter() as to
why we are caching the row_filter here instead of
get_rel_sync_entry()? That has been discussed multiple times so it is
better to capture that in comments.

Added comment in v40 [1]
I think apart from truncate and error cases, it can also happen for
other operations because we decide whether to publish a change
(operation) after calling get_rel_sync_entry() in pgoutput_change. I
think we can reflect that as well in the comment.
5. Why do you need a separate variable rowfilter_valid to indicate
whether a valid row filter exists? Why is exprstate not sufficient?
Can you update comments to indicate why we need this variable
separately?

I have improved the (existing) comment in v40 [1].
One more thing related to this code:
pgoutput_row_filter()
{
..
+ if (!entry->rowfilter_valid)
{
..
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
..
}
Why do we need to initialize scantuple here unless we are sure that
the row filter is going to get associated with this relentry? I think
when there is no row filter then this allocation is not required.
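The caching pattern under discussion — look the filter up lazily on first use, and guard the cache with a separate validity flag because a NULL exprstate is itself a legitimate cached answer ("no filter") — can be sketched as follows. The names are illustrative only, not the patch's actual code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Illustrative stand-in for RelationSyncEntry's row-filter fields. */
typedef struct Entry
{
    bool rowfilter_valid;   /* has the cache below been computed yet? */
    int *exprstate;         /* NULL means "no filter for this relation" */
} Entry;

static int lookups;         /* instrumented to show the lookup runs once */

/* Pretend catalog lookup: odd relids have a "value > 10" filter. */
static int *
lookup_filter(int relid)
{
    lookups++;
    if (relid % 2 == 0)
        return NULL;

    int *threshold = malloc(sizeof(int));
    *threshold = 10;
    return threshold;
}

/*
 * The first call per entry does the (possibly expensive) lookup;
 * later calls only test the cached pointer.  The separate flag is
 * needed because exprstate == NULL cannot distinguish "not computed
 * yet" from "computed, and there is no filter".
 */
static bool
row_filter_passes(Entry *entry, int relid, int value)
{
    if (!entry->rowfilter_valid)
    {
        entry->exprstate = lookup_filter(relid);
        entry->rowfilter_valid = true;
    }
    if (entry->exprstate == NULL)
        return true;            /* no filter: always publish */
    return value > *entry->exprstate;
}
```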
--
With Regards,
Amit Kapila.
On Tue, Nov 23, 2021 at 1:29 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Tues, Nov 23, 2021 2:27 PM vignesh C <vignesh21@gmail.com> wrote:
On Thu, Nov 18, 2021 at 7:04 AM Peter Smith <smithpb2250@gmail.com>
wrote:

PSA new set of v40* patches.
Few comments:
1) When a table is added to the publication, replica identity is checked. But
while modifying the publish action to include delete/update, replica identity is
not checked for the existing tables. I felt it should be checked for the existing
tables too.

In addition to this, I think we might also need some check to prevent users from
changing the REPLICA IDENTITY index which is used in the filter expression.

I was thinking whether it is possible to do the check related to REPLICA IDENTITY
in function CheckCmdReplicaIdentity() or in GetRelationPublicationActions(). If we
move the REPLICA IDENTITY check to this function, it would be consistent with
the existing behavior about the check related to REPLICA IDENTITY(see the
comments in CheckCmdReplicaIdentity) and seems can cover all the cases
mentioned above.
Yeah, adding the replica identity check in CheckCmdReplicaIdentity()
would cover all the above cases but I think that would put a premium
on each update/delete operation. I think traversing the expression
tree (it could be multiple traversals if the relation is part of
multiple publications) during each update/delete would be costly.
Don't you think so?
--
With Regards,
Amit Kapila.
Attaching a new patchset v41 which includes changes by both Peter and myself.
Patches v40-0005 and v40-0006 have been merged to create patch
v41-0005 which reduces the patches to 6 again.
This patch-set contains changes addressing the following review comments:
On Mon, Nov 15, 2021 at 5:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
What I meant was that with this new code we have regressed the old
behavior. Basically, imagine a case where no filter was given for any
of the tables. Then after the patch, we will remove all the old tables
whereas before the patch it will remove the oldrels only when they are
not specified as part of new rels. If you agree with this, then we can
retain the old behavior and for the new tables, we can always override
the where clause for a SET variant of command.
Fixed and modified the behaviour to match with what the schema patch
implemented.
On Mon, Nov 15, 2021 at 9:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
2.
+ * The OptWhereClause (row-filter) must be stored here
+ * but it is valid only for tables. If the ColId was
+ * mistakenly not a table this will be detected later
+ * in preprocess_pubobj_list() and an error thrown.

/error thrown/error is thrown
Fixed.
:
6. In rowfilter_expr_checker(), the expression tree is traversed
twice, can't we traverse it once to detect all non-allowed stuff? It
can be sometimes costly to traverse the tree multiple times especially
when the expression is complex and it doesn't seem acceptable to do so
unless there is some genuine reason for the same.
Fixed.
On Tue, Nov 16, 2021 at 7:24 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
doc/src/sgml/ref/create_publication.sgml
(1) improve comment
+ /* Set up a pstate to parse with */

"pstate" is the variable name, better to use "ParseState".
Fixed.
src/test/subscription/t/025_row_filter.pl
(2) rename TAP test 025 to 026
I suggest that the t/025_row_filter.pl TAP test should be renamed to
026 now because 025 is being used by some schema TAP test.
Fixed
On Tue, Nov 16, 2021 at 7:50 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
---
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent.

I think this comment is not correct, I think the correct statement
would be "only data that satisfies the row filters is pulled by the
subscriber"
Fixed
I think this message is not correct, because for update also we can
not have filters on the non-key attribute right? Even w.r.t the first
patch also if the non update non key toast columns are there we can
not apply filters on those. So this comment seems misleading to me.
Fixed
- Oid relid = RelationGetRelid(targetrel->relation);
..
+ relid = RelationGetRelid(targetrel);
+

Why this change is required, I mean instead of fetching the relid
during the variable declaration why do we need to do it separately
now?
Fixed
+ if (expr == NULL)
+     ereport(ERROR,
+             (errcode(ERRCODE_CANNOT_COERCE),
+              errmsg("row filter returns type %s that cannot be coerced to the expected type %s",

Instead of "coerced to" can we use "cast to"? That will be in sync
with other similar kinds of user-exposed error messages.
----
Fixed
I can see the caller of this function is already switching to
CacheMemoryContext, so what is the point in doing it again here?
Maybe if the caller is expected to do so, we can Assert on the
CurrentMemoryContext.
Fixed.
On Thu, Nov 18, 2021 at 9:36 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
(2) missing test case
It seems that the current tests are not testing the
multiple-row-filter case (n_filters > 1) in the following code in
pgoutput_row_filter_init():

rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes, -1) :
linitial(rfnodes);

I think a test needs to be added similar to the customers+countries
example that Tomas gave (where there is a single subscription to
multiple publications of the same table, each of which has a
row-filter).
Test case added.
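The scenario the new test exercises — one subscription pulling the same table through several publications, each with its own row filter — reduces to the following sketch. It is illustrative only, with made-up filter functions standing in for the OR'ed filter expressions:

```c
#include <assert.h>
#include <stdbool.h>

typedef bool (*Filter)(int row);

/* Two example per-publication predicates (hypothetical). */
static bool pub1_filter(int row) { return row > 5; }
static bool pub2_filter(int row) { return row % 2 == 0; }

/*
 * A row qualifies if ANY of the publications' filters accepts it
 * (the filters are combined with OR), and an unfiltered table
 * publishes every row.
 */
static bool
copy_row(Filter *filters, int nfilters, int row)
{
    if (nfilters == 0)
        return true;            /* no filters at all: copy every row */

    for (int i = 0; i < nfilters; i++)
        if (filters[i](row))
            return true;
    return false;
}
```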
On Fri, Nov 19, 2021 at 4:15 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
I notice that in the 0001 patch, it adds a "relid" member to the
PublicationRelInfo struct:

src/include/catalog/pg_publication.h
typedef struct PublicationRelInfo
{
+ Oid relid;
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;

It appears that this new member is not actually required, as the relid
can be simply obtained from the existing "relation" member - using the
RelationGetRelid() macro.
Fixed.
On Mon, Nov 22, 2021 at 12:44 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
Another thing I noticed was in the 0004 patch, list_free_deep() should
be used instead of list_free() in the following code block, otherwise
the rfnodes themselves (allocated by stringToNode()) are not freed:

src/backend/replication/pgoutput/pgoutput.c
+ if (rfnodes)
+ {
+     list_free(rfnodes);
+     rfnodes = NIL;
+ }
Fixed.
We will be addressing the rest of the comments in the next patch.
regards,
Ajin Cherian
Fujitsu Australia
Attachments:
v41-0003-PS-ExprState-cache-modifications.patchapplication/octet-stream; name=v41-0003-PS-ExprState-cache-modifications.patchDownload
From 7c8e8a9c89b9fb7c10dc3a3ec560e3c21c0c1a8f Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Mon, 22 Nov 2021 20:26:50 -0500
Subject: [PATCH v41 3/6] PS - ExprState cache modifications.
Now the cached row-filters (e.g. ExprState *) are invalidated only in
rel_sync_cache_relation_cb function, so it means the ALTER PUBLICATION for one
table should not cause row-filters of other tables to also become invalidated.
Also all code related to caching row-filters has been removed from the
get_rel_sync_entry function and is now done just before they are needed in the
pgoutput_row_filter function.
If there are multiple publication filters for a given table these are all
combined into a single filter.
Author: Peter Smith, Greg Nancarrow
Changes are based on a suggestions from Amit [1] [2], and Houz [3]
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
[3] https://www.postgresql.org/message-id/OS0PR01MB5716090A70A73ADF58C58950948D9%40OS0PR01MB5716.jpnprd01.prod.outlook.com
---
src/backend/replication/pgoutput/pgoutput.c | 214 ++++++++++++++++++----------
1 file changed, 139 insertions(+), 75 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 3643684..fd024d4 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -1,4 +1,4 @@
-/*-------------------------------------------------------------------------
+/*------------------------------------------------------------------------
*
* pgoutput.c
* Logical Replication output plugin
@@ -21,6 +21,7 @@
#include "executor/executor.h"
#include "fmgr.h"
#include "nodes/nodeFuncs.h"
+#include "nodes/makefuncs.h"
#include "optimizer/optimizer.h"
#include "parser/parse_coerce.h"
#include "replication/logical.h"
@@ -123,7 +124,16 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
- List *exprstate; /* ExprState for row filter */
+
+ /*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' indicates if the exprstate has been assigned
+ * yet or not. We cannot just use the exprstate value for this purpose
+ * because there might be no filter at all for the current relid (e.g.
+ * exprstate is NULL).
+ */
+ bool rowfilter_valid;
+ ExprState *exprstate; /* ExprState for row filter(s) */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
@@ -161,7 +171,7 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static EState *create_estate_for_relation(Relation rel);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
/*
@@ -731,20 +741,118 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
* If it returns true, the change is replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
{
EState *estate;
ExprContext *ecxt;
ListCell *lc;
bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ List *rfnodes = NIL;
+ int n_filters;
+
+ /*
+ * If the row filter caching is currently flagged "invalid" then it means we
+ * don't know yet if there is/isn't any row filters for this relation.
+ *
+ * This code is usually one-time execution.
+ *
+ * NOTE: The ExprState cache could have been created up-front in the
+ * function get_rel_sync_entry() instead of the deferred on-the-fly
+ * assignment below. The reason for choosing to do it here is because there
+ * are some scenarios where the get_rel_sync_entry() is called but where a
+ * row will not be published. For example, for truncate, we may not need
+ * any row evaluation, so there is no need to compute it. It would also be
+ * a waste if any error happens before actually evaluating the filter. And
+ * tomorrow there could be other operations (which use get_rel_sync_entry)
+ * but which don't need to build ExprState. So the decision was to defer
+ * this logic to last moment when we know it will be needed.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ MemoryContext oldctx;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains. Release the tuple table slot if it
+ * already exists.
+ */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState and cache it in entry->exprstate.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+
+ /*
+ * Lookup if there is a row-filter, and if yes remember it in a list.
+ * In code following this 'publications' loop we will combine all filters.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(entry->publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes = lappend(rfnodes, rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+
+ } /* loop all subscribed publications */
+
+ /*
+ * Combine all the row-filters (if any) into a single filter, and then build the ExprState for it
+ */
+ n_filters = list_length(rfnodes);
+ if (n_filters > 0)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = n_filters > 1 ? makeBoolExpr(AND_EXPR, rfnodes, -1) : linitial(rfnodes);
+ entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ entry->rowfilter_valid = true;
+ }
/* Bail out if there is no row filter */
- if (entry->exprstate == NIL)
+ if (!entry->exprstate)
return true;
elog(DEBUG3, "table \"%s.%s\" has row filter",
- get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
- get_rel_name(relation->rd_id));
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
PushActiveSnapshot(GetTransactionSnapshot());
@@ -757,20 +865,13 @@ pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, R
ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
/*
- * If the subscription has multiple publications and the same table has a
- * different row filter in these publications, all row filters must be
- * matched in order to replicate this change.
+ * NOTE: Multiple publication row-filters have already been combined to a
+ * single exprstate.
*/
- foreach(lc, entry->exprstate)
+ if (entry->exprstate)
{
- ExprState *exprstate = (ExprState *) lfirst(lc);
-
/* Evaluates row filter */
- result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
-
- /* If the tuple does not match one of the row filters, bail out */
- if (!result)
- break;
+ result = pgoutput_row_filter_exec_expr(entry->exprstate, ecxt);
}
/* Cleanup allocated resources */
@@ -840,7 +941,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
break;
/*
@@ -873,7 +974,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -907,7 +1008,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1321,10 +1422,11 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
- entry->exprstate = NIL;
+ entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1344,7 +1446,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
List *schemaPubids = GetSchemaPublications(schemaId);
ListCell *lc;
Oid publish_as_relid = relid;
- TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1358,22 +1459,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
publications_valid = true;
}
- /* Release tuple table slot */
- if (entry->scantuple != NULL)
- {
- ExecDropSingleTupleTableSlot(entry->scantuple);
- entry->scantuple = NULL;
- }
-
- /*
- * Create a tuple table slot for row filter. TupleDesc must live as
- * long as the cache remains.
- */
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- tupdesc = CreateTupleDescCopy(tupdesc);
- entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
- MemoryContextSwitchTo(oldctx);
-
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1383,9 +1468,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
{
Publication *pub = lfirst(lc);
bool publish = false;
- HeapTuple rftuple;
- Datum rfdatum;
- bool rfisnull;
if (pub->alltables)
{
@@ -1449,33 +1531,6 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- /*
- * Cache row filter, if available. All publication-table mappings
- * must be checked. If it is a partition and pubviaroot is true,
- * use the row filter of the topmost partitioned table instead of
- * the row filter of its own partition.
- */
- rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
- if (HeapTupleIsValid(rftuple))
- {
- rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
-
- if (!rfisnull)
- {
- Node *rfnode;
- ExprState *exprstate;
-
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = stringToNode(TextDatumGetCString(rfdatum));
-
- /* Prepare for expression execution */
- exprstate = pgoutput_row_filter_init_expr(rfnode);
- entry->exprstate = lappend(entry->exprstate, exprstate);
- MemoryContextSwitchTo(oldctx);
- }
-
- ReleaseSysCache(rftuple);
- }
}
list_free(pubids);
@@ -1582,6 +1637,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanups. (Will be rebuilt later if needed).
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstate != NULL)
+ {
+ pfree(entry->exprstate);
+ entry->exprstate = NULL;
+ }
}
}
@@ -1622,12 +1692,6 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
-
- if (entry->exprstate != NIL)
- {
- list_free_deep(entry->exprstate);
- entry->exprstate = NIL;
- }
}
MemoryContextSwitchTo(oldctx);
--
1.8.3.1
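As a reminder of the user-facing behavior these pgoutput hunks implement, here is a minimal (hypothetical) publisher-side session based on the example in the cover letter; only rows matching the publication's WHERE clause are sent to subscribers:

```sql
CREATE TABLE departments (id int PRIMARY KEY, name text);
CREATE PUBLICATION foo FOR TABLE departments
    WHERE (id > 2000 AND id <= 3000);

-- Only the second row satisfies the row filter, so only it is
-- replicated to subscribers of publication foo.
INSERT INTO departments VALUES (1999, 'sales'), (2500, 'engineering');
```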
Attachment: v41-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patch (application/octet-stream)
From 8cd6e5334318f513f654f3b2ed727e22a9933b32 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Mon, 22 Nov 2021 20:20:08 -0500
Subject: [PATCH v41 2/6] PS - Add tab auto-complete support for the Row Filter
WHERE.
e.g.
"CREATE PUBLICATION <name> FOR TABLE <name>" - complete with "WHERE (".
"ALTER PUBLICATION <name> ADD|SET TABLE <name>" - complete with "WHERE (".
Author: Peter Smith
---
src/bin/psql/tab-complete.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 4f724e4..8c7fe7d 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,14 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (");
+ /* "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with table attributes */
+ /* "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2757,10 +2765,13 @@ psql_completion(const char *text, int start, int end)
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
+ /* "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
/*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
--
1.8.3.1
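To illustrate the completions added above, an interactive psql session would behave roughly as follows (a sketch, not runnable SQL; `mypub` and `departments` are illustrative names, and `<TAB>` marks where the user presses the tab key):

```sql
-- After this patch, tab completion here offers "WHERE (" as well as "WITH (":
CREATE PUBLICATION mypub FOR TABLE departments <TAB>

-- Inside "WHERE (" the table's column names are offered, via
-- COMPLETE_WITH_ATTR on the preceding table name:
ALTER PUBLICATION mypub ADD TABLE departments WHERE ( <TAB>
```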
Attachment: v41-0005-PS-Row-filter-validation-walker.patch (application/octet-stream)
From b0d560846a200f80db430151a4667584735c1303 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Mon, 22 Nov 2021 22:00:50 -0500
Subject: [PATCH v41 5/6] PS - Row filter validation walker
This patch implements a parse-tree "walker" to validate a row-filter expression.
REPLICA IDENTITY validation
---------------------------
For publish mode "delete", it validates that any columns referenced in the filter
expression are part of the REPLICA IDENTITY or primary key.
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
Expression Node-kind validation
-------------------------------
Only simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no user-defined functions.
- no system functions (unless they are IMMUTABLE). See design decision at [1].
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr, NullIfExpr
[1] https://www.postgresql.org/message-id/CAA4eK1%2BXoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ%40mail.gmail.com
This patch also disables (#if 0) all other row-filter errors which were thrown for
EXPR_KIND_PUBLICATION_WHERE in the 0001 patch.
~~
Test code and PG docs are also updated.
Author: Peter Smith
---
doc/src/sgml/ref/create_publication.sgml | 5 +-
src/backend/catalog/pg_publication.c | 178 +++++++++++++++++++++++++++++-
src/backend/parser/parse_agg.c | 5 +-
src/backend/parser/parse_expr.c | 6 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 2 +
src/test/regress/expected/publication.out | 134 +++++++++++++++++++---
src/test/regress/sql/publication.sql | 98 +++++++++++++++-
src/test/subscription/t/026_row_filter.pl | 7 +-
9 files changed, 405 insertions(+), 33 deletions(-)
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 07e714b..98bf1fb 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -231,8 +231,9 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
The <literal>WHERE</literal> clause should contain only columns that are
- part of the primary key or be covered by <literal>REPLICA
- IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
+ covered by <literal>REPLICA IDENTITY</literal>, or are part of the primary
+ key (when <literal>REPLICA IDENTITY</literal> is not set), otherwise
+ <command>DELETE</command> operations will not
be replicated. That's because old row is used and it only contains primary
key or columns that are part of the <literal>REPLICA IDENTITY</literal>; the
remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 3ffec3a..eb653c4 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,9 +33,11 @@
#include "catalog/pg_publication_namespace.h"
#include "catalog/pg_publication_rel.h"
#include "catalog/pg_type.h"
+#include "catalog/pg_proc.h"
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parse_clause.h"
#include "parser/parse_collate.h"
#include "parser/parse_relation.h"
@@ -219,10 +221,177 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
+typedef struct {
+ Relation rel;
+ bool check_replident;
+ Bitmapset *bms_replident;
+}
+rf_context;
+
+/*
+ * The row filter walker checks that the row filter expression is legal.
+ *
+ * Rules: Node-type validation
+ * ---------------------------
+ * Allow only simple or compound expressions like:
+ * - "(Var Op Const)" or
+ * - "(Var Op Const) Bool (Var Op Const)"
+ *
+ * Specifically,
+ * - User-defined operators are not allowed.
+ * - User-defined functions are not allowed.
+ * - System functions that are not IMMUTABLE are not allowed.
+ * - NULLIF is allowed.
+ *
+ * Rules: Replica Identity validation
+ * -----------------------------------
+ * If the flag context.check_replident is true then validate that every variable
+ * referenced by the filter expression is a valid member of the allowed set of
+ * replica identity columns (context.bms_replident).
+ */
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+ bool too_complex = false;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ /* Optionally, do replica identity validation of the referenced column. */
+ if (context->check_replident)
+ {
+ Oid relid = RelationGetRelid(context->rel);
+ Var *var = (Var *) node;
+ AttrNumber attnum = var->varattno;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, context->bms_replident))
+ {
+ const char *colname = get_attname(relid, attnum, false);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ RelationGetRelationName(context->rel)),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+ }
+ }
+ else if (IsA(node, Const) || IsA(node, BoolExpr) || IsA(node, NullIfExpr))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ Oid funcid = ((FuncExpr *)node)->funcid;
+ char *funcname = get_func_name(funcid);
+
+ /*
+ * User-defined functions are not allowed.
+ * System-functions that are not IMMUTABLE are not allowed.
+ */
+ if (funcid >= FirstNormalObjectId)
+ {
+ forbidden = psprintf("user-defined functions are not allowed: %s",
+ funcname);
+ }
+ else
+ {
+ if (func_volatile(funcid) != PROVOLATILE_IMMUTABLE)
+ forbidden = psprintf("system functions that are not IMMUTABLE are not allowed: %s",
+ funcname);
+ }
+ }
+ else
+ {
+ elog(DEBUG1, "the row filter contained something unexpected: %s", nodeToString(node));
+ too_complex = true;
+ }
+
+ if (too_complex)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(context->rel)),
+ errhint("only simple expressions using columns, constants and immutable system functions are allowed")
+ ));
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(context->rel)),
+ errdetail("%s", forbidden)
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
+ * Check if the row-filter is valid according to the following rules:
+ *
+ * 1. Only certain simple node types are permitted in the expression. See
+ * function rowfilter_walker for details.
+ *
+ * 2. If the publish operation contains "delete" then only columns that
+ * are allowed by the REPLICA IDENTITY rules are permitted to be used in the
+ * row-filter WHERE clause.
*/
+static void
+rowfilter_expr_checker(Publication *pub, Relation rel, Node *rfnode)
+{
+ rf_context context = {0};
+
+ context.rel = rel;
+
+ /*
+ * For "delete", check that filter cols are also valid replica identity
+ * cols.
+ *
+ * TODO - check later for publish "update" case.
+ */
+ if (pub->pubactions.pubdelete)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ context.check_replident = true;
+
+ /*
+ * Find what are the cols that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+ }
+ }
+
+ /*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if anything not permitted or unexpected is encountered.
+ */
+ rowfilter_walker(rfnode, &context);
+
+ bms_free(context.bms_replident);
+}
+
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -315,10 +484,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
whereclause = transformWhereClause(pstate,
copyObject(pri->whereClause),
EXPR_KIND_PUBLICATION_WHERE,
- "PUBLICATION");
+ "PUBLICATION WHERE");
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, targetrel, whereclause);
}
/* Form a tuple. */
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..5e0c391 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,12 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
if (isAgg)
err = _("aggregate functions are not allowed in publication WHERE expressions");
else
err = _("grouping operations are not allowed in publication WHERE expressions");
-
+#endif
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +952,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("window functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..3519e62 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -201,6 +201,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
case T_FuncCall:
{
+#if 0
/*
* Forbid functions in publication WHERE condition
*/
@@ -209,6 +210,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("functions are not allowed in publication WHERE expressions"),
parser_errposition(pstate, exprLocation(expr))));
+#endif
result = transformFuncCall(pstate, (FuncCall *) expr);
break;
@@ -1777,7 +1779,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("cannot use subquery in publication WHERE expression");
+#endif
break;
/*
@@ -3100,7 +3104,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 29bebb7..212f473 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,10 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+ pstate->p_hasTargetSRFs = true;
+#if 0
err = _("set-returning functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..b3588df 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,12 +718,14 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+#if 0
/* Check it's not a custom operator for publication WHERE expressions */
if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("user-defined operators are not allowed in publication WHERE expressions"),
parser_errposition(pstate, location)));
+#endif
/* Do typecasting and build the expression tree */
if (ltree == NULL)
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 6959675..fdf7659 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -248,13 +248,15 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -264,7 +266,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -275,7 +277,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -286,7 +288,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
@@ -310,26 +312,26 @@ Publications:
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax1
Publication testpub_syntax1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE ((e < 999))
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax2
Publication testpub_syntax2
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
@@ -353,19 +355,31 @@ ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: user-defined functions are not allowed: testpub_rf_func99
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: system functions that are not IMMUTABLE are not allowed: random
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+HINT: only simple expressions using columns, constants and immutable system functions are allowed
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
@@ -387,6 +401,92 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 40198fc..c7160bd 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -143,7 +143,9 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -163,12 +165,12 @@ RESET client_min_messages;
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax1
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax2
DROP PUBLICATION testpub_syntax2;
@@ -182,13 +184,23 @@ SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
SET client_min_messages = 'ERROR';
@@ -208,6 +220,82 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
+
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/026_row_filter.pl b/src/test/subscription/t/026_row_filter.pl
index 64e71d0..de6b73d 100644
--- a/src/test/subscription/t/026_row_filter.pl
+++ b/src/test/subscription/t/026_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -280,9 +282,7 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -291,7 +291,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
--
1.8.3.1
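The REPLICA IDENTITY validation rules that the rf_tbl_abcd_* test cases above walk through can be sketched in Python. This is an illustrative model of the rule the tests exercise, not the patch's C implementation; the function and argument names are invented:

```python
# Model of the row-filter column-reference validation exercised by the
# rf_tbl_abcd_* regression tests: when the publication publishes
# update/delete, every column referenced by the WHERE expression must be
# part of the table's replica identity.

def filter_columns_valid(filter_cols, replica_identity_cols,
                         publish_insert_only=False):
    """Return True if the row filter references only allowed columns.

    filter_cols: set of column names referenced by the WHERE expression.
    replica_identity_cols: set of replica-identity columns (PK columns
        for DEFAULT with a PK, all columns for FULL, the index columns
        for USING INDEX, the empty set for NOTHING or for DEFAULT on a
        table with no PK).
    publish_insert_only: True models publish = "insert", whose
        validation is less strict because INSERT evaluates the filter
        against the new tuple, which always has all columns.
    """
    if publish_insert_only:
        return True
    return filter_cols <= replica_identity_cols

# Case 1a: REPLICA IDENTITY DEFAULT with PRIMARY KEY (a, b)
assert filter_columns_valid({"a"}, {"a", "b"})        # ok - "a" is a PK col
assert not filter_columns_valid({"c"}, {"a", "b"})    # fail - "c" is not
# Case 2: REPLICA IDENTITY FULL - every column allowed
assert filter_columns_valid({"c"}, {"a", "b", "c", "d"})
# Case 3: REPLICA IDENTITY NOTHING - no column allowed
assert not filter_columns_valid({"a"}, set())
# Case 4: REPLICA IDENTITY USING INDEX on (c)
assert filter_columns_valid({"c"}, {"c"})
assert not filter_columns_valid({"a"}, {"c"})
```

This mirrors why, in the test script, the same `WHERE (c > 99)` filter fails for rf_tbl_abcd_pk under REPLICA IDENTITY DEFAULT but succeeds after switching the table to FULL or to the index on `c`.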
Attachment: v41-0004-PS-Combine-multiple-filters-with-OR-instead-of-A.patch (application/octet-stream)
From 77dfa2d4ed8ae230aabbd2add1680b825c84d0b1 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Mon, 22 Nov 2021 21:27:36 -0500
Subject: [PATCH v41 4/6] PS - Combine multiple filters with OR instead of AND.
This is a change of behavior requested by Tomas [1]. The subscription is now
treated "as a union of all the publications", so the filters are combined with
OR instead of AND.
If the subscription has several publications in which the same table has been
published with different filters, those expressions get OR'ed together so that
rows satisfying any of the expressions will be replicated.
Note that this means if one of the publications has no filter at all, then all
other filters become redundant.
Updated documentation.
Added more test cases.
Author: Peter Smith
[1] https://www.postgresql.org/message-id/574b4e78-2f35-acf3-4bdc-4b872582e739%40enterprisedb.com
---
doc/src/sgml/ref/create_subscription.sgml | 27 +++++++----
src/backend/replication/logical/tablesync.c | 27 +++++++++--
src/backend/replication/pgoutput/pgoutput.c | 25 ++++++++++-
src/test/subscription/t/026_row_filter.pl | 69 ++++++++++++++++++++++++++---
4 files changed, 128 insertions(+), 20 deletions(-)
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 5a9430e..42bf8c2 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -206,15 +206,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether to copy pre-existing data in the publications
that are being subscribed to when the replication starts.
- The default is <literal>true</literal>. If any table in the
- publications has a <literal>WHERE</literal> clause, rows that do not
- satisfy the <replaceable class="parameter">expression</replaceable>
- will not be copied. If the subscription has several publications in
- which a table has been published with different
- <literal>WHERE</literal> clauses, rows must satisfy all expressions
- to be copied. If the subscriber is a
- <productname>PostgreSQL</productname> version before 15 then any row
- filtering is ignored.
+ The default is <literal>true</literal>.
+ </para>
+ <para>
+ Row filtering may also apply here and will affect what data is
+ copied. Refer to the Notes section below.
</para>
</listitem>
</varlistentry>
@@ -327,6 +323,19 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
the parameter <literal>create_slot = false</literal>. This is an
implementation restriction that might be lifted in a future release.
</para>
+
+ <para>
+ If any table in the publication has a <literal>WHERE</literal> clause, rows
+ that do not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be replicated. If the subscription has several publications in
+ which the same table has been published with different filters, those
+ expressions get OR'ed together so that rows satisfying any of the expressions
+ will be replicated. Notice this means if one of the publications has no filter
+ at all then all other filters become redundant. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row filtering
+ is ignored.
+ </para>
+
</refsect1>
<refsect1>
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 9d86a10..e9b7f7c 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -838,6 +838,13 @@ fetch_remote_table_info(char *nspname, char *relname,
(errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
nspname, relname, res->err)));
+ /*
+ * Multiple row-filter expressions for the same table will later be
+ * combined by the COPY using OR, but this means if any of the filters is
+ * null, then effectively none of the other filters is meaningful. So this
+ * loop is also checking for null filters and can exit early if any are
+ * encountered.
+ */
slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
{
@@ -847,6 +854,20 @@ fetch_remote_table_info(char *nspname, char *relname,
*qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
ExecClearTuple(slot);
+
+ if (isnull)
+ {
+ /*
+ * A single null filter nullifies the effect of any other filter for this
+ * table.
+ */
+ if (*qual)
+ {
+ list_free(*qual);
+ *qual = NIL;
+ }
+ break;
+ }
}
ExecDropSingleTupleTableSlot(slot);
@@ -896,7 +917,7 @@ copy_table(Relation rel)
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
* do SELECT * because we need to not copy generated columns. For
- * tables with any row filters, build a SELECT query with AND'ed row
+ * tables with any row filters, build a SELECT query with OR'ed row
* filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
@@ -908,7 +929,7 @@ copy_table(Relation rel)
}
appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
- /* list of AND'ed filters */
+ /* list of OR'ed filters */
if (qual != NIL)
{
ListCell *lc;
@@ -922,7 +943,7 @@ copy_table(Relation rel)
if (first)
first = false;
else
- appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, " OR ");
appendStringInfoString(&cmd, q);
}
list_free_deep(qual);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index fd024d4..b332057 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -797,6 +797,11 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
* NOTE: If the relation is a partition and pubviaroot is true, use
* the row filter of the topmost partitioned table instead of the row
* filter of its own partition.
+ *
+ * NOTE: Multiple row-filters for the same table are combined by OR-ing
+ * them together, but this means that if (in any of the publications)
+ * there is *no* filter then effectively none of the other filters have
+ * any meaning either.
*/
foreach(lc, data->publications)
{
@@ -825,12 +830,28 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
}
ReleaseSysCache(rftuple);
+
+ if (rfisnull)
+ {
+ /*
+ * If there is no row-filter, then any other row-filters for this table
+ * also have no effect (because filters get OR-ed together) so we can
+ * just discard anything found so far and exit early from the publications
+ * loop.
+ */
+ if (rfnodes)
+ {
+ list_free_deep(rfnodes);
+ rfnodes = NIL;
+ }
+ break;
+ }
}
} /* loop all subscribed publications */
/*
- * Combine all the row-filters (if any) into a single filter, and then build the ExprState for it
+ * Combine all the row-filters (if any) into a single filter using OR, and then build the ExprState for it
*/
n_filters = list_length(rfnodes);
if (n_filters > 0)
@@ -838,7 +859,7 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
Node *rfnode;
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = n_filters > 1 ? makeBoolExpr(AND_EXPR, rfnodes, -1) : linitial(rfnodes);
+ rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes, -1) : linitial(rfnodes);
entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
MemoryContextSwitchTo(oldctx);
}
diff --git a/src/test/subscription/t/026_row_filter.pl b/src/test/subscription/t/026_row_filter.pl
index e806b5d..64e71d0 100644
--- a/src/test/subscription/t/026_row_filter.pl
+++ b/src/test/subscription/t/026_row_filter.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
-use Test::More tests => 7;
+use Test::More tests => 10;
# create publisher node
my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
@@ -23,6 +23,8 @@ $node_publisher->safe_psql('postgres',
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
);
$node_publisher->safe_psql('postgres',
@@ -45,6 +47,8 @@ $node_subscriber->safe_psql('postgres',
$node_subscriber->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
);
$node_subscriber->safe_psql('postgres',
@@ -86,6 +90,13 @@ $node_publisher->safe_psql('postgres',
"CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4a FOR TABLE tab_rowfilter_4 WHERE (c % 2 = 0)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4b FOR TABLE tab_rowfilter_4"
+);
+
#
# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
# SQL commands are for testing the initial data copy using logical replication.
@@ -103,6 +114,8 @@ $node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) SELECT generate_series(1, 10)");
# insert data into partitioned table and directly on the partition
$node_publisher->safe_psql('postgres',
@@ -115,7 +128,7 @@ $node_publisher->safe_psql('postgres',
my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
my $appname = 'tap_sub';
$node_subscriber->safe_psql('postgres',
- "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4a, tap_pub_4b"
);
$node_publisher->wait_for_catchup($appname);
@@ -143,14 +156,26 @@ is( $result, qq(1001|test 1001
# Check expected replicated rows for tab_rowfilter_2
# tap_pub_1 filter is: (c % 2 = 0)
# tap_pub_2 filter is: (c % 3 = 0)
-# When there are multiple publications for the same table, all filter
-# expressions should succeed. In this case, rows are replicated if c value is
-# divided by 2 AND 3 (6, 12, 18).
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if
+# the c value is divisible by 2 OR 3 (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
#
$result =
$node_subscriber->safe_psql('postgres',
"SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
-is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+is($result, qq(13|2|20), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but when there is no filter it just means
+# OR everything - i.e. the same as no filter at all.
+# Expect all rows: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(10|1|10), 'check initial data copy from table tab_rowfilter_4');
# Check expected replicated rows for tab_rowfilter_3
# There is no filter. 10 rows are inserted, so 10 rows are replicated.
@@ -210,9 +235,41 @@ $node_publisher->safe_psql('postgres',
"UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
$node_publisher->safe_psql('postgres',
"DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) VALUES (21), (22), (23), (24), (25)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) VALUES (0), (11), (12)");
$node_publisher->wait_for_catchup($appname);
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if
+# the c value is divisible by 2 OR 3.
+#
+# Expect original rows (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+# Plus (21, 22, 24)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(16|2|24), 'check replicated rows to tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but when there is no filter it just means
+# OR everything - i.e. the same as no filter at all.
+# Expect all rows from initial copy: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+# And also (0, 11, 12)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
+
# Check expected replicated rows for tab_rowfilter_1
# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
#
--
1.8.3.1
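The OR-combination semantics that patch v41-0004 implements (in tablesync.c and pgoutput.c above) can be modeled with a short Python sketch. This is illustrative only, not the patch's C code; the function and variable names are invented for the example:

```python
# Model of how v41-0004 combines per-publication row filters for one table:
# filters are OR'ed, and a publication with no filter ("None") makes every
# other filter redundant (the early-exit in pgoutput_row_filter and
# fetch_remote_table_info).

def combine_filters(filters):
    """Combine per-publication filter predicates for a single table.

    filters: list of predicates (callable: row -> bool); None means the
    publication publishes the table with no row filter.
    Returns a single predicate deciding whether a row is replicated.
    """
    if any(f is None for f in filters):
        # One unfiltered publication means every row qualifies, so the
        # other filters can be discarded entirely.
        return lambda row: True
    return lambda row: any(f(row) for f in filters)

# tap_pub_1 (c % 2 = 0) OR'ed with tap_pub_2 (c % 3 = 0):
# of 1..20, rows 2,3,4,6,8,9,10,12,14,15,16,18,20 qualify -> 13 rows,
# matching the qq(13|2|20) expectation in 026_row_filter.pl.
pred = combine_filters([lambda c: c % 2 == 0, lambda c: c % 3 == 0])
assert sum(pred(c) for c in range(1, 21)) == 13

# tap_pub_4a (c % 2 = 0) plus unfiltered tap_pub_4b: all 10 rows copied.
pred = combine_filters([lambda c: c % 2 == 0, None])
assert sum(pred(c) for c in range(1, 11)) == 10
```

Under the pre-0004 AND semantics the first case would have replicated only multiples of 6 (6, 12, 18), which is exactly the qq(3|6|18) expectation the patch removes from the TAP test.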
Attachment: v41-0001-Row-filter-for-logical-replication.patch (application/octet-stream)
From 4596907b95a23ec0f772abd55b087361ad473c8e Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Fri, 19 Nov 2021 06:12:04 -0500
Subject: [PATCH v41 1/6] Row filter for logical replication.
This feature adds row filters for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed in parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (which is limited to the
primary key or REPLICA IDENTITY columns) to evaluate the row filter. INSERT
and UPDATE use the new row version to evaluate the row filter, hence any
column can be used. If the row filter evaluates to NULL, it is treated as
false. For simplicity, functions are not allowed; this could be addressed in
a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is pulled by the subscriber. If the subscription has several
publications in which a table has been published with different WHERE clauses,
rows must satisfy all expressions to be copied. If the subscriber is a pre-15
version, data synchronization won't use row filters even if they are defined
on the publisher. Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
The psql commands \dRp+ and \d+ will display any row filters.
Author: Euler Taveira, Peter Smith
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 13 +-
doc/src/sgml/ref/create_publication.sgml | 33 ++-
doc/src/sgml/ref/create_subscription.sgml | 10 +-
src/backend/catalog/pg_publication.c | 48 ++++-
src/backend/commands/publicationcmds.c | 80 ++++++--
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 38 +++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 95 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 257 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 27 ++-
src/include/catalog/pg_publication.h | 3 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 148 ++++++++++++++
src/test/regress/sql/publication.sql | 75 +++++++
src/test/subscription/t/026_row_filter.pl | 300 ++++++++++++++++++++++++++++
25 files changed, 1156 insertions(+), 55 deletions(-)
mode change 100644 => 100755 src/backend/parser/gram.y
create mode 100644 src/test/subscription/t/026_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1d11be..af6b1f6 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6299,6 +6299,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index bb4ef5e..01247d7 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -30,7 +30,7 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -52,7 +52,9 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
remove one or more tables/schemas from the publication. Note that adding
tables/schemas to a publication that is already subscribed to will require a
<literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on the
- subscribing side in order to become effective.
+ subscribing side in order to become effective. Note also that a row-filter
+ <literal>WHERE</literal> clause is not allowed in the <literal>DROP</literal>
+ clause.
</para>
<para>
@@ -109,7 +111,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 4aeb0c8..07e714b 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -28,7 +28,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -76,6 +76,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -226,6 +230,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or are covered by <literal>REPLICA
+ IDENTITY</literal>; otherwise, <command>DELETE</command> operations will not
+ be replicated. That's because the old row is used, and it only contains the
+ primary key or columns that are part of the <literal>REPLICA IDENTITY</literal>;
+ the remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ and <command>UPDATE</command> operations, any column may be used in the
+ <literal>WHERE</literal> clause, because the new row is used and it contains
+ all columns. A <literal>NULL</literal> value causes the expression to evaluate
+ to false; avoid using columns without not-null constraints in the
+ <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
+ not allow functions or user-defined operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -240,6 +259,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -253,6 +277,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..5a9430e 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -206,7 +206,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
Specifies whether to copy pre-existing data in the publications
that are being subscribed to when the replication starts.
- The default is <literal>true</literal>.
+ The default is <literal>true</literal>. If any table in the
+ publications has a <literal>WHERE</literal> clause, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be copied. If the subscription has several publications in
+ which a table has been published with different
+ <literal>WHERE</literal> clauses, rows must satisfy all expressions
+ to be copied. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row
+ filtering is ignored.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 63579b2..3ffec3a 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -257,18 +260,22 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -289,10 +296,30 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a ParseState to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -306,6 +333,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -322,6 +355,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
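Once publication_add_relation has stored the transformed expression in pg_publication_rel.prqual, it can be deparsed back to SQL with pg_get_expr, the same approach the tablesync and pg_dump hunks take. A sketch (output depends on the publications defined):

```sql
SELECT p.pubname,
       pr.prrelid::regclass AS tablename,
       pg_get_expr(pr.prqual, pr.prrelid) AS row_filter
FROM pg_publication p
JOIN pg_publication_rel pr ON pr.prpubid = p.oid;
```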
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 7d4a0e9..c9ad079 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -529,40 +529,61 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *delrels = NIL;
ListCell *oldlc;
+ /*
+ * Check if the relation is a member of an existing schema in the
+ * publication, or a member of the specified schema list.
+ */
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
-
- /* Calculate which relations to drop. */
+ /*
+ * Remove tables that are not found in the new table list and those
+ * tables which have a qual expression. The qual expression could be
+ * in the old table list or in the new table list.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
+ ListCell *newlc;
+ PublicationRelInfo *oldrel;
bool found = false;
+ HeapTuple rftuple;
+ bool rfisnull = true;
+
+ /* Look up the syscache to see whether the old entry has a row filter */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(oldrelid),
+ ObjectIdGetDatum(pubid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual,
+ &rfisnull);
+ ReleaseSysCache(rftuple);
+ }
foreach(newlc, rels)
{
PublicationRelInfo *newpubrel;
newpubrel = (PublicationRelInfo *) lfirst(newlc);
- if (RelationGetRelid(newpubrel->relation) == oldrelid)
+
+ /*
+ * If either the new or the old relation has a WHERE clause, remove
+ * the old entry so that it can be re-added afresh later.
+ */
+ if (RelationGetRelid(newpubrel->relation) == oldrelid &&
+ newpubrel->whereClause == NULL && rfisnull)
{
found = true;
break;
}
}
- /* Not yet in the list, open it and add to the list */
+
if (!found)
{
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrelid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
}
@@ -899,6 +920,7 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ List *relids_with_rf = NIL;
/*
* Open, share-lock, and check all the explicitly-specified relations
@@ -926,15 +948,30 @@ OpenTableList(List *tables)
*/
if (list_member_oid(relids, myrelid))
{
+ char *relname = pstrdup(RelationGetRelationName(rel));
+
table_close(rel, ShareUpdateExclusiveLock);
+
+ /* Disallow duplicate tables if there are any with row-filters. */
+ if (t->whereClause || list_member_oid(relids_with_rf, myrelid))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("conflicting or redundant row-filters for \"%s\"",
+ relname)));
+
+ pfree(relname);
continue;
}
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
+ if (t->whereClause)
+ relids_with_rf = lappend_oid(relids_with_rf, myrelid);
+
/*
* Add children of this rel, if requested, so that they too are added
* to the publication. A partitioned table can't have any inheritance
@@ -967,6 +1004,8 @@ OpenTableList(List *tables)
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -974,6 +1013,7 @@ OpenTableList(List *tables)
}
list_free(relids);
+ list_free(relids_with_rf);
return rels;
}
@@ -993,6 +1033,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -1039,10 +1081,11 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
{
PublicationRelInfo *pub_rel = (PublicationRelInfo *) lfirst(lc);
Relation rel = pub_rel->relation;
+ Oid relid = RelationGetRelid(rel);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(relid, GetUserId()))
aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
RelationGetRelationName(rel));
@@ -1088,6 +1131,11 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
RelationGetRelationName(rel))));
}
+ if (pubrel->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE")));
+
ObjectAddressSet(obj, PublicationRelRelationId, prid);
performDeletion(&obj, DROP_CASCADE, 0);
}
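The duplicate-table check added to OpenTableList above rejects redundant entries whenever a row filter is involved; a sketch of the error case (names hypothetical):

```sql
CREATE PUBLICATION p FOR TABLE t WHERE (a > 1), t WHERE (a < 10);
-- ERROR:  conflicting or redundant row-filters for "t"
```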
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index ad1ea2f..e3b0039 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4830,6 +4830,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index f537d3e..5776447 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2312,6 +2312,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index a6d0cef..4d0616b
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9652,12 +9652,13 @@ CreatePublicationStmt:
* relation_expr here.
*/
PublicationObjSpec:
- TABLE relation_expr
+ TABLE relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_TABLE;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $2;
+ $$->pubtable->whereClause = $3;
}
| ALL TABLES IN_P SCHEMA ColId
{
@@ -9672,28 +9673,45 @@ PublicationObjSpec:
$$->pubobjtype = PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA;
$$->location = @5;
}
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ /*
+ * The OptWhereClause (row-filter) must be stored here
+ * but it is valid only for tables. If the ColId was
+ * mistakenly not a table this will be detected later
+ * in preprocess_pubobj_list() and an error is thrown.
+ */
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
$$->location = @1;
}
- | ColId indirection
+ | ColId indirection OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
+ $$->pubtable->whereClause = $3;
$$->location = @1;
}
/* grammar like tablename * , ONLY tablename, ONLY ( tablename ) */
- | extended_relation_expr
+ | extended_relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $1;
+ $$->pubtable->whereClause = $2;
}
| CURRENT_SCHEMA
{
@@ -17341,7 +17359,8 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ if (pubobj->name)
{
/* convert it to PublicationTable */
PublicationTable *pubtable = makeNode(PublicationTable);
@@ -17354,6 +17373,13 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
else if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_SCHEMA ||
pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA)
{
+ /* Row filters are not allowed on schema objects. */
+ if (pubobj->pubtable && pubobj->pubtable->whereClause)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("WHERE clause for a schema is not allowed"),
+ parser_errposition(pubobj->location));
+
/*
* We can distinguish between the different type of schema
* objects based on whether name and pubtable is set.
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 542f916..29bebb7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..9d86a10 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,59 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get relation qual. DISTINCT avoids the same expression of a table in
+ * multiple publications from being included multiple times in the final
+ * expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +866,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +875,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +886,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with AND'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +906,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of AND'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " AND ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
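For a table published with row filters in two publications, copy_table above ends up issuing a COPY command of roughly this shape on the publisher (identifiers are illustrative):

```sql
COPY (SELECT id, amount, created_at
      FROM public.orders
      WHERE (created_at > '2021-01-01') AND (amount >= 100)) TO STDOUT;
```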
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6f6a203..3643684 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,27 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -114,6 +123,8 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ List *exprstate; /* ExprState for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -137,7 +148,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +157,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +639,149 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+
+ /* Cache ExprState using CacheMemoryContext. */
+ Assert(CurrentMemoryContext == CacheMemoryContext);
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be cast to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * This is the same code as ExecPrepareExpr() but that is not used because
+ * we have no EState to pass it. There should probably be another function
+ * in the executor to handle the execution outside a normal Plan tree
+ * context.
+ */
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (entry->exprstate == NIL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * If the subscription has multiple publications and the same table has a
+ * different row filter in these publications, all row filters must be
+ * matched in order to replicate this change.
+ */
+ foreach(lc, entry->exprstate)
+ {
+ ExprState *exprstate = (ExprState *) lfirst(lc);
+
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
+
+ /* If the tuple does not match one of the row filters, bail out */
+ if (!result)
+ break;
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +808,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +832,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +839,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * Schema should be sent before the logic that replaces the
+ * relation because it also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +872,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +906,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +975,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1116,9 +1297,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1141,6 +1323,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->replicate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NIL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1160,6 +1344,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
List *schemaPubids = GetSchemaPublications(schemaId);
ListCell *lc;
Oid publish_as_relid = relid;
+ TupleDesc tupdesc = RelationGetDescr(relation);
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -1173,6 +1358,22 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
publications_valid = true;
}
+ /* Release tuple table slot */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
/*
* Build publication cache. We can't use one provided by relcache as
* relcache considers all publications given relation is in, but here
@@ -1182,6 +1383,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
{
Publication *pub = lfirst(lc);
bool publish = false;
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
if (pub->alltables)
{
@@ -1245,9 +1449,33 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
+ /*
+ * Cache row filter, if available. All publication-table mappings
+ * must be checked. If it is a partition and pubviaroot is true,
+ * use the row filter of the topmost partitioned table instead of
+ * the row filter of its own partition.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+ ExprState *exprstate;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+
+ /* Prepare for expression execution */
+ exprstate = pgoutput_row_filter_init_expr(rfnode);
+ entry->exprstate = lappend(entry->exprstate, exprstate);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
}
list_free(pubids);
@@ -1365,6 +1593,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1603,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1391,7 +1622,15 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubupdate = false;
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
+
+ if (entry->exprstate != NIL)
+ {
+ list_free_deep(entry->exprstate);
+ entry->exprstate = NIL;
+ }
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 7e98371..b404fd2 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4229,6 +4229,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4239,9 +4240,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4250,6 +4258,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4290,6 +4299,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationObject(&(pubrinfo[j].dobj), fout);
@@ -4360,8 +4373,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index d1d8608..0842a3c 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -630,6 +630,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index ea721d9..8be5643 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3150,17 +3150,22 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
" JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
"WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
"UNION\n"
"SELECT pubname\n"
+ " , pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+ " , pg_catalog.pg_class c\n"
"WHERE pr.prrelid = '%s'\n"
+ " AND c.oid = pr.prrelid\n"
"UNION\n"
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
"WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
"ORDER BY 1;",
@@ -3196,6 +3201,13 @@ describeOneTableDetails(const char *schemaname,
printfPQExpBuffer(&buf, " \"%s\"",
PQgetvalue(result, i, 0));
+ if (pset.sversion >= 150000)
+ {
+ /* Also display the publication row-filter (if any) for this table */
+ if (!PQgetisnull(result, i, 1))
+ appendPQExpBuffer(&buf, " WHERE (%s)", PQgetvalue(result, i, 1));
+ }
+
printTableAddFooter(&cont, buf.data);
}
PQclear(result);
@@ -6320,8 +6332,12 @@ addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
for (i = 0; i < count; i++)
{
if (!singlecol)
+ {
printfPQExpBuffer(buf, " \"%s.%s\"", PQgetvalue(res, i, 0),
PQgetvalue(res, i, 1));
+ if (!PQgetisnull(res, i, 2))
+ appendPQExpBuffer(buf, " WHERE (%s)", PQgetvalue(res, i, 2));
+ }
else
printfPQExpBuffer(buf, " \"%s\"", PQgetvalue(res, i, 0));
@@ -6450,8 +6466,15 @@ describePublications(const char *pattern)
{
/* Get the tables for the specified publication */
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 1ae439e..bd0d4ce 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -86,6 +86,7 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -122,7 +123,7 @@ extern List *GetPubPartitionOptionRelations(List *result,
Oid relid);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 067138e..5d58a9f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3641,6 +3641,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 1feb558..6959675 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -239,6 +239,154 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+ Table "public.testpub_rf_tbl1"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+----------+--------------+-------------
+ a | integer | | | | plain | |
+ b | text | | | | extended | |
+Publications:
+ "testpub5a" WHERE ((a > 1))
+ "testpub5b"
+ "testpub5c" WHERE ((a > 3))
+
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+ Publication testpub_syntax1
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE ((e < 999))
+
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+ Publication testpub_syntax2
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
+
+DROP PUBLICATION testpub_syntax2;
+-- fail - a WHERE row-filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a =...
+ ^
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+ERROR: invalid to use WHERE (row-filter) for a schema
+LINE 1: ...ax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf...
+ ^
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if any of them has a WHERE row-filter
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+ERROR: cannot add relation "testpub_rf_myschema1.testpub_rf_tb16" to publication
+DETAIL: Table's schema "testpub_rf_myschema1" is already part of the publication or part of the specified schema list.
+RESET client_min_messages;
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 8fa0435..40198fc 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -134,6 +134,81 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;
+-- fail - a WHERE row-filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if any of them has a WHERE row-filter
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+RESET client_min_messages;
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/026_row_filter.pl b/src/test/subscription/t/026_row_filter.pl
new file mode 100644
index 0000000..e806b5d
--- /dev/null
+++ b/src/test/subscription/t/026_row_filter.pl
@@ -0,0 +1,300 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 7;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, all filter
+# expressions must succeed. In this case, rows are replicated only if the
+# c value is divisible by both 2 AND 3 (6, 12, 18).
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(3|6|18), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
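The expected subscriber state asserted in the test above can be cross-checked with a small standalone model. This is a deliberate simplification (one flat table, only the root filter a < 5000, and updates that never change the filter column), not the patch's code:

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_ROWS 32

typedef struct { int a; int b; } Row;
typedef struct { Row rows[MAX_ROWS]; size_t n; } Table;

/* publish_via_partition_root = true: only the root table's filter applies */
static bool filter(int a) { return a < 5000; }

/* An INSERT reaches the subscriber only if the new row passes the filter. */
static void apply_insert(Table *t, int a, int b)
{
    if (!filter(a) || t->n >= MAX_ROWS)
        return;
    t->rows[t->n].a = a;
    t->rows[t->n].b = b;
    t->n++;
}

/* Filtered-out updates are never sent; here 'a' never changes, so the
 * old/new-tuple transformation of the 0006 patch does not come into play. */
static void apply_update(Table *t, int a, int newb)
{
    if (!filter(a))
        return;
    for (size_t i = 0; i < t->n; i++)
        if (t->rows[i].a == a)
            t->rows[i].b = newb;
}

static void apply_delete(Table *t, int a)
{
    if (!filter(a))
        return;
    for (size_t i = 0; i < t->n; i++)
        if (t->rows[i].a == a)
        {
            t->rows[i] = t->rows[t->n - 1];
            t->n--;
            return;
        }
}
```

Replaying the initial-copy rows and the post-copy DML from the test through this model yields exactly the five rows (1|100, 2|200, 4000|400, 4001|30, 4500|450) that the test expects.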
v41-0006-Support-updates-based-on-old-and-new-tuple-in-ro.patch (application/octet-stream)
From ecd6b8cb177f3bec7eecdaf40f1874415bcd8aad Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Tue, 23 Nov 2021 04:49:26 -0500
Subject: [PATCH v41 6/6] Support updates based on old and new tuple in row
filters.
When applying row filter on updates, check both old_tuple and new_tuple
to decide how an update needs to be transformed.
UPDATE
old-row (match) new-row (no match) -> DELETE
old-row (no match) new row (match) -> INSERT
old-row (match) new row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Also, deformed tuples are cached in slots to avoid deforming the
same tuple multiple times.
Author: Ajin Cherian
---
doc/src/sgml/ref/create_publication.sgml | 2 +-
src/backend/catalog/pg_publication.c | 8 +-
src/backend/replication/logical/proto.c | 35 ++--
src/backend/replication/pgoutput/pgoutput.c | 243 ++++++++++++++++++++++++++--
src/include/replication/logicalproto.h | 7 +-
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/026_row_filter.pl | 4 +-
src/tools/pgindent/typedefs.list | 1 +
8 files changed, 266 insertions(+), 40 deletions(-)
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 98bf1fb..f067125 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -233,7 +233,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
The <literal>WHERE</literal> clause should contain only columns that are
covered by <literal>REPLICA IDENTITY</literal>, or are part of the primary
key (when <literal>REPLICA IDENTITY</literal> is not set), otherwise
- <command>DELETE</command> operations will not
+ <command>DELETE</command> or <command>UPDATE</command> operations will not
be replicated. That's because old row is used and it only contains primary
key or columns that are part of the <literal>REPLICA IDENTITY</literal>; the
remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index eb653c4..bf8384b 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -340,7 +340,7 @@ rowfilter_walker(Node *node, rf_context *context)
* 1. Only certain simple node types are permitted in the expression. See
* function rowfilter_walker for details.
*
- * 2. If the publish operation contains "delete" then only columns that
+ * 2. If the publish operation contains "delete" or "delete" then only columns that
* are allowed by the REPLICA IDENTITY rules are permitted to be used in the
* row-filter WHERE clause.
*/
@@ -352,12 +352,10 @@ rowfilter_expr_checker(Publication *pub, Relation rel, Node *rfnode)
context.rel = rel;
/*
- * For "delete", check that filter cols are also valid replica identity
+ * For "delete" or "update", check that filter cols are also valid replica identity
* cols.
- *
- * TODO - check later for publish "update" case.
*/
- if (pub->pubactions.pubdelete)
+ if (pub->pubactions.pubdelete || pub->pubactions.pubupdate)
{
char replica_identity = rel->rd_rel->relreplident;
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..6b55a94 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -31,8 +32,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
- HeapTuple tuple, bool binary);
-
+ HeapTuple tuple, TupleTableSlot *slot,
+ bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -410,7 +411,7 @@ logicalrep_write_insert(StringInfo out, TransactionId xid, Relation rel,
pq_sendint32(out, RelationGetRelid(rel));
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, NULL, binary);
}
/*
@@ -442,7 +443,8 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
*/
void
logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
- HeapTuple oldtuple, HeapTuple newtuple, bool binary)
+ HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary)
{
pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
@@ -463,11 +465,11 @@ logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
pq_sendbyte(out, 'O'); /* old tuple follows */
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, oldslot, binary);
}
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, newslot, binary);
}
/*
@@ -536,7 +538,7 @@ logicalrep_write_delete(StringInfo out, TransactionId xid, Relation rel,
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, NULL, binary);
}
/*
@@ -749,11 +751,12 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
-logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binary)
+logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, TupleTableSlot *slot,
+bool binary)
{
TupleDesc desc;
- Datum values[MaxTupleAttributeNumber];
- bool isnull[MaxTupleAttributeNumber];
+ Datum *values;
+ bool *isnull;
int i;
uint16 nliveatts = 0;
@@ -771,7 +774,17 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binar
enlargeStringInfo(out, tuple->t_len +
nliveatts * (1 + 4));
- heap_deform_tuple(tuple, desc, values, isnull);
+ if (slot == NULL || TTS_EMPTY(slot))
+ {
+ values = (Datum *) palloc(desc->natts * sizeof(Datum));
+ isnull = (bool *) palloc(desc->natts * sizeof(bool));
+ heap_deform_tuple(tuple, desc, values, isnull);
+ }
+ else
+ {
+ values = slot->tts_values;
+ isnull = slot->tts_isnull;
+ }
/* Write the values */
for (i = 0; i < desc->natts; i++)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index b332057..5492bf4 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -134,7 +134,10 @@ typedef struct RelationSyncEntry
*/
bool rowfilter_valid;
ExprState *exprstate; /* ExprState for row filter(s) */
- TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
+ TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
+ TupleTableSlot *tmp_new_tuple; /* slot for temporary new tuple used for expression evaluation */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -169,10 +172,16 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot,
+ RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry,
+ ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -736,17 +745,112 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * Update is checked against the row filter, if any.
+ * Updates are transformed to inserts and deletes based on the
+ * old_tuple and new_tuple. The new action is updated in the
+ * action parameter. If not updated, action remains as update.
+ * old-row (no match) new-row (no match) -> (drop change)
+ * old-row (no match) new row (match) -> INSERT
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (match) new row (match) -> UPDATE
+ * If the change is to be replicated returns true, else false.
*/
static bool
-pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
+{
+ TupleDesc desc = entry->scantuple->tts_tupleDescriptor;
+ int i;
+ bool old_matched, new_matched;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ /* update requires a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /*
+ * If no old_tuple, then none of the replica identity columns changed
+ * and this would reduce to a simple update.
+ */
+ if (!oldtuple)
+ {
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ return pgoutput_row_filter(relation, NULL, newtuple, entry);
+ }
+
+ old_slot = entry->old_tuple;
+ new_slot = entry->new_tuple;
+ tmp_new_slot = entry->tmp_new_tuple;
+ ExecClearTuple(old_slot);
+ ExecClearTuple(new_slot);
+ ExecClearTuple(tmp_new_slot);
+
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, new_slot);
+
+ /*
+ * For updates, both the newtuple and oldtuple need to be checked
+ * against the row-filter. The newtuple might not have all the
+ * replica identity columns, in which case it needs to be
+ * copied over from the oldtuple.
+ */
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ /* if the column in the new_tuple is null, nothing to do */
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ /*
+ * Unchanged toasted replica identity columns are
+ * only detoasted in the old tuple, copy this over to the newtuple.
+ */
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ Assert(!old_slot->tts_isnull[i] &&
+ !VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]));
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ }
+
+ }
+
+ old_matched = pgoutput_row_filter_virtual(relation, old_slot, entry);
+ new_matched = pgoutput_row_filter_virtual(relation, tmp_new_slot, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (!old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && old_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter, the first time.
+ */
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
{
- EState *estate;
- ExprContext *ecxt;
ListCell *lc;
- bool result = true;
Oid relid = RelationGetRelid(relation);
List *rfnodes = NIL;
int n_filters;
@@ -774,7 +878,7 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
TupleDesc tupdesc = RelationGetDescr(relation);
/*
- * Create a tuple table slot for row filter. TupleDesc must live as
+ * Create tuple table slots for row filter. TupleDesc must live as
* long as the cache remains. Release the tuple table slot if it
* already exists.
*/
@@ -783,9 +887,28 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
+ if (entry->old_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->old_tuple);
+ entry->old_tuple = NULL;
+ }
+ if (entry->new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->new_tuple);
+ entry->new_tuple = NULL;
+ }
+ if (entry->tmp_new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->tmp_new_tuple);
+ entry->tmp_new_tuple = NULL;
+ }
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+
MemoryContextSwitchTo(oldctx);
/*
@@ -866,6 +989,67 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
entry->rowfilter_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * The row is passed in as a virtual slot.
+ *
+ */
+static bool
+pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ if (message_level_is_interesting(DEBUG3))
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(entry->relid)),
+ get_rel_name(entry->relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = slot;
+
+ /*
+ * NOTE: Multiple publication row-filters have already been combined to a
+ * single exprstate.
+ */
+ if (entry->exprstate)
+ {
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(entry->exprstate, ecxt);
+ }
+
+ /* Cleanup allocated resources */
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
/* Bail out if there is no row filter */
if (!entry->exprstate)
@@ -896,7 +1080,6 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
}
/* Cleanup allocated resources */
- ResetExprContext(ecxt);
FreeExecutorState(estate);
PopActiveSnapshot();
@@ -954,6 +1137,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -962,7 +1148,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
break;
/*
@@ -993,9 +1179,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
- /* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter_update_check(relation, oldtuple, newtuple, relentry,
+ &modified_action))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1018,8 +1205,27 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ logicalrep_write_update(ctx->out, xid, relation,
+ oldtuple, newtuple, relentry->old_tuple,
+ relentry->new_tuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1029,7 +1235,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1447,6 +1653,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
+ entry->new_tuple = NULL;
+ entry->old_tuple = NULL;
+ entry->tmp_new_tuple = NULL;
entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..427c40a 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -211,7 +212,11 @@ extern void logicalrep_write_insert(StringInfo out, TransactionId xid,
extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup);
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
- HeapTuple newtuple, bool binary);
+ HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/026_row_filter.pl b/src/test/subscription/t/026_row_filter.pl
index de6b73d..a2f25f6 100644
--- a/src/test/subscription/t/026_row_filter.pl
+++ b/src/test/subscription/t/026_row_filter.pl
@@ -277,7 +277,8 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+# but row deleted after the update below.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -289,7 +290,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index da6ac8e..2f41eac 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2194,6 +2194,7 @@ ReorderBufferApplyChangeCB
ReorderBufferApplyTruncateCB
ReorderBufferBeginCB
ReorderBufferChange
+ReorderBufferChangeType
ReorderBufferCommitCB
ReorderBufferCommitPreparedCB
ReorderBufferDiskChange
--
1.8.3.1
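The UPDATE transformation matrix in the commit message above can be sketched as a standalone decision function; the enum names below are simplified stand-ins for the ReorderBufferChangeType values that pgoutput_row_filter_update_check writes into its action parameter:

```c
#include <stdbool.h>

typedef enum
{
    CHANGE_INSERT,
    CHANGE_UPDATE,
    CHANGE_DELETE,
    CHANGE_DROP                 /* neither tuple matches: skip the change */
} SketchChangeType;

/*
 * Decide how an UPDATE is transformed, given whether the old and the new
 * row match the row filter.
 */
static SketchChangeType
transform_update(bool old_matched, bool new_matched)
{
    if (!old_matched && !new_matched)
        return CHANGE_DROP;     /* row was never visible to the subscriber */
    if (!old_matched && new_matched)
        return CHANGE_INSERT;   /* row enters the filtered set */
    if (old_matched && !new_matched)
        return CHANGE_DELETE;   /* row leaves the filtered set */
    return CHANGE_UPDATE;       /* row stays in the filtered set */
}
```

The INSERT and DELETE cases are what keep the subscriber's copy of the filtered set consistent when a row crosses the filter boundary.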
On Tue, Nov 23, 2021 at 4:58 PM Ajin Cherian <itsajin@gmail.com> wrote:
Attaching a new patchset v41 which includes changes by both Peter and myself.
In the 0003 patch, why is the change below required?
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -1,4 +1,4 @@
-/*-------------------------------------------------------------------------
+/*------------------------------------------------------------------------
*
* pgoutput.c
I suggest that at this stage we combine 0001, 0003, and 0004. Then move
the pg_dump and psql (describe.c) related changes to 0002 and make 0002
the last patch in the series. This will help us review the backend changes
first and then look at the client-side changes.
After the above, rearrange the code in pgoutput_row_filter() so that the two
different checks related to 'rfisnull' (introduced by different patches)
can be combined into an if .. else check.
--
With Regards,
Amit Kapila.
On Tue, Nov 23, 2021 at 4:58 PM Ajin Cherian <itsajin@gmail.com> wrote:
Attaching a new patchset v41 which includes changes by both Peter and myself.
Few comments on v41-0002 patch:
1) Tab completion should handle completing "WITH (" in
"create publication pub1 for table t1 where (c1 > 10)":
@@ -2757,10 +2765,13 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
 		COMPLETE_WITH("IN SCHEMA", "WITH (");
 	else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
-		COMPLETE_WITH("WITH (");
+		COMPLETE_WITH("WHERE (", "WITH (");
 	/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
 	else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
+	/* "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with table attributes */
+	else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+		COMPLETE_WITH_ATTR(prev3_wd, "");
2) Tab completion completes with "WHERE (" in case of "alter
publication pub1 add table t1,":
+	/* ALTER PUBLICATION <name> SET TABLE <name> */
+	/* ALTER PUBLICATION <name> ADD TABLE <name> */
+	else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny))
+		COMPLETE_WITH("WHERE (");
Should this be changed to:
+	/* ALTER PUBLICATION <name> SET TABLE <name> */
+	/* ALTER PUBLICATION <name> ADD TABLE <name> */
+	else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+		COMPLETE_WITH("WHERE (");
Regards,
Vignesh
On Tues, Nov 23, 2021 6:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Nov 23, 2021 at 1:29 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
On Tues, Nov 23, 2021 2:27 PM vignesh C <vignesh21@gmail.com> wrote:
On Thu, Nov 18, 2021 at 7:04 AM Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v40* patches.
Few comments:
1) When a table is added to the publication, replica identity is checked. But
while modifying the publish action to include delete/update, replica identity is
not checked for the existing tables. I felt it should be checked for the existing
tables too.
In addition to this, I think we might also need some check to prevent user from
changing the REPLICA IDENTITY index which is used in the filter expression.
I was thinking is it possible do the check related to REPLICA IDENTITY in
function CheckCmdReplicaIdentity() or In GetRelationPublicationActions(). If we
move the REPLICA IDENTITY check to this function, it would be consistent with
the existing behavior about the check related to REPLICA IDENTITY(see the
comments in CheckCmdReplicaIdentity) and seems can cover all the cases
mentioned above.
Yeah, adding the replica identity check in CheckCmdReplicaIdentity()
would cover all the above cases but I think that would put a premium
on each update/delete operation. I think traversing the expression
tree (it could be multiple traversals if the relation is part of
multiple publications) during each update/delete would be costly.
Don't you think so?
Yes, I agreed that traversing the expression every time would be costly.
I thought maybe we can cache the columns used in the row filter, or cache only a
flag (can_update|delete), in the relcache. I think every operation that affects
the row-filter or replica-identity will invalidate the relcache, and the cost of
the check seems acceptable with the cache.
The reason I thought it might be better to do the check in
CheckCmdReplicaIdentity is that we might otherwise need to add duplicate check
code in a couple of places; for example, we might need to check
replica-identity when:
[ALTER REPLICA IDENTITY |
DROP INDEX |
ALTER PUBLICATION ADD TABLE |
ALTER PUBLICATION SET (pubaction)]
Best regards,
Hou zj
On Wed, Nov 24, 2021 at 6:51 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Tues, Nov 23, 2021 6:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Nov 23, 2021 at 1:29 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
On Tues, Nov 23, 2021 2:27 PM vignesh C <vignesh21@gmail.com> wrote:
On Thu, Nov 18, 2021 at 7:04 AM Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v40* patches.
Few comments:
1) When a table is added to the publication, replica identity is checked. But
while modifying the publish action to include delete/update, replica identity is
not checked for the existing tables. I felt it should be checked for the existing
tables too.
In addition to this, I think we might also need some check to prevent user from
changing the REPLICA IDENTITY index which is used in the filter expression.
I was thinking is it possible do the check related to REPLICA IDENTITY in
function CheckCmdReplicaIdentity() or In GetRelationPublicationActions(). If we
move the REPLICA IDENTITY check to this function, it would be consistent with
the existing behavior about the check related to REPLICA IDENTITY(see the
comments in CheckCmdReplicaIdentity) and seems can cover all the cases
mentioned above.
Yeah, adding the replica identity check in CheckCmdReplicaIdentity()
would cover all the above cases but I think that would put a premium
on each update/delete operation. I think traversing the expression
tree (it could be multiple traversals if the relation is part of
multiple publications) during each update/delete would be costly.
Don't you think so?
Yes, I agreed that traversing the expression every time would be costly.
I thought maybe we can cache the columns used in the row filter, or cache only a
flag (can_update|delete), in the relcache. I think every operation that affects
the row-filter or replica-identity will invalidate the relcache, and the cost of
the check seems acceptable with the cache.
I think if we can cache this information especially as a bool flag
then that should probably be better.
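A minimal sketch of the cached-flag idea being discussed, with an invented struct and a callback standing in for the expensive expression-tree walk (this is illustration only, not relcache code):

```c
#include <stdbool.h>

typedef bool (*identity_check_fn)(void);

/* Hypothetical cache entry; names and layout are invented. */
typedef struct RowFilterPubStatus
{
    bool flags_valid;   /* cleared by (simulated) relcache invalidation */
    bool can_update;
    bool can_delete;
} RowFilterPubStatus;

/*
 * Per-UPDATE check: do the expensive walk at most once, then reuse the
 * cached boolean until the entry is invalidated.
 */
static bool
filter_allows_update(RowFilterPubStatus *st, identity_check_fn expensive_walk)
{
    if (!st->flags_valid)
    {
        st->can_update = st->can_delete = expensive_walk();
        st->flags_valid = true;
    }
    return st->can_update;
}

/* e.g. ALTER REPLICA IDENTITY, DROP INDEX, ALTER PUBLICATION ... */
static void
invalidate(RowFilterPubStatus *st)
{
    st->flags_valid = false;
}
```

With this shape, the per-row cost of CheckCmdReplicaIdentity-style enforcement reduces to a flag test except immediately after an invalidation.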
--
With Regards,
Amit Kapila.
On Tue, Nov 23, 2021 at 4:58 PM Ajin Cherian <itsajin@gmail.com> wrote:
Attaching a new patchset v41 which includes changes by both Peter and myself.
Patches v40-0005 and v40-0006 have been merged to create patch
v41-0005 which reduces the patches to 6 again.
Few comments:
1) I'm not sure if we will be able to throw a better error message than
"ERROR: missing FROM-clause entry for table "t4"" in this case; if
possible, you could change it.
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a pstate to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+		nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+											   AccessShareLock,
+											   NULL, false, false);
+		addNSItemToQuery(pstate, nsitem, false, true, true);
+
+		whereclause = transformWhereClause(pstate,
+										   copyObject(pri->whereClause),
+										   EXPR_KIND_PUBLICATION_WHERE,
+										   "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
alter publication pub1 add table t5 where ( t4.c1 = 10);
ERROR: missing FROM-clause entry for table "t4"
LINE 1: alter publication pub1 add table t5 where ( t4.c1 = 10);
^
pstate->p_expr_kind is stored as EXPR_KIND_PUBLICATION_WHERE, so we could
differentiate using the expr_kind.
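For what it's worth, the error itself is accurate: a row filter may only reference the table being published. A hypothetical session using the pub1/t5/t4 names from the example above:

```sql
-- ok: columns may be qualified with the published table itself
ALTER PUBLICATION pub1 ADD TABLE t5 WHERE (t5.c1 = 10);

-- fails: t4 is not in scope for a row filter on t5
ALTER PUBLICATION pub1 ADD TABLE t5 WHERE (t4.c1 = 10);
```

The suggestion here is only about reporting a more publication-specific message for the failing case.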
2) Should '"delete" or "delete"' be '"delete" or "update"'
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -340,7 +340,7 @@ rowfilter_walker(Node *node, rf_context *context)
* 1. Only certain simple node types are permitted in the expression. See
* function rowfilter_walker for details.
*
- * 2. If the publish operation contains "delete" then only columns that
+ * 2. If the publish operation contains "delete" or "delete" then only columns that
 * are allowed by the REPLICA IDENTITY rules are permitted to be used in the
* row-filter WHERE clause.
*/
@@ -352,12 +352,10 @@ rowfilter_expr_checker(Publication *pub,
Relation rel, Node *rfnode)
context.rel = rel;
/*
- * For "delete", check that filter cols are also valid replica identity
+ * For "delete" or "update", check that filter cols are also valid replica identity
 * cols.
3) Should we include the row filter condition in the pg_publication_tables
view, like in describe publication (\dRp+), since the prqual is not
easily readable in the pg_publication_rel table:
select * from pg_publication_tables ;
pubname | schemaname | tablename
---------+------------+-----------
pub1 | public | t1
(1 row)
select * from pg_publication_rel ;
oid | prpubid | prrelid |
prqual
-------+---------+---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
16389 | 16388 | 16384 | {OPEXPR :opno 518 :opfuncid 144
:opresulttype 16 :opretset false :opcollid 0 :inputcollid 0 :args
({VAR :varno 1 :varattno 1 :vartype 23 :vartypmod -1 :varcollid 0
:varlevelsup 0 :va
rnosyn 1 :varattnosyn 1 :location 45} {CONST :consttype 23
:consttypmod -1 :constcollid 0 :constlen 4 :constbyval true
:constisnull false :location 51 :constvalue 4 [ 0 0 0 0 0 0 0 0 ]})
:location 48}
(1 row)
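As a side note, the stored node tree can be decompiled into readable SQL with the built-in pg_get_expr() function, so the proposed view column could be derived along these lines (a sketch only; the "rowfilter" column name is an assumption):

```sql
SELECT p.pubname, n.nspname AS schemaname, c.relname AS tablename,
       pg_get_expr(pr.prqual, pr.prrelid) AS rowfilter
FROM pg_publication_rel pr
     JOIN pg_publication p ON p.oid = pr.prpubid
     JOIN pg_class c ON c.oid = pr.prrelid
     JOIN pg_namespace n ON n.oid = c.relnamespace;
```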
4) This should be included in typedefs.list; also, we could add some
comments for this structure:
+typedef struct {
+ Relation rel;
+ Bitmapset *bms_replident;
+}
+rf_context;
5) A few includes are not required: #include "miscadmin.h" is not required
in pg_publication.c, #include "executor/executor.h" is not required in
proto.c, and #include "access/xact.h", #include "executor/executor.h" and
#include "replication/logicalrelation.h" are not required in pgoutput.c.
6) Typo: "filte" should be "filter":
+/*
+ * The row filte walker checks that the row filter expression is legal.
+ *
+ * Rules: Node-type validation
+ * ---------------------------
+ * Allow only simple or compound expressions like:
+ * - "(Var Op Const)" or
+ * - "(Var Op Const) Bool (Var Op Const)"
Regards,
Vignesh
On Tue, Nov 23, 2021 at 4:58 PM Ajin Cherian <itsajin@gmail.com> wrote:
Attaching a new patchset v41 which includes changes by both Peter and myself.
Patches v40-0005 and v40-0006 have been merged to create patch
v41-0005 which reduces the patches to 6 again.
This patch-set contains changes addressing the following review comments:

On Mon, Nov 15, 2021 at 5:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
What I meant was that with this new code we have regressed the old
behavior. Basically, imagine a case where no filter was given for any
of the tables. Then after the patch, we will remove all the old tables
whereas before the patch it will remove the oldrels only when they are
not specified as part of new rels. If you agree with this, then we can
retain the old behavior and for the new tables, we can always override
the where clause for a SET variant of the command.

Fixed and modified the behaviour to match what the schema patch
implemented.
+
+ /*
+ * If the new relation or the old relation has a where clause,
+ * we need to remove it so that it can be added afresh later.
+ */
+ if (RelationGetRelid(newpubrel->relation) == oldrelid &&
+ newpubrel->whereClause == NULL && rfisnull)
Can't we use _equalPublicationTable() here? It compares the whereClause as well.
Few more comments:
=================
0001
1.
@@ -1039,10 +1081,11 @@ PublicationAddTables(Oid pubid, List *rels,
bool if_not_exists,
{
PublicationRelInfo *pub_rel = (PublicationRelInfo *) lfirst(lc);
Relation rel = pub_rel->relation;
+ Oid relid = RelationGetRelid(rel);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(relid, GetUserId()))
Here, you can directly use RelationGetRelid as was used in the
previous code without using an additional variable.
0005
2.
+typedef struct {
+ Relation rel;
+ bool check_replident;
+ Bitmapset *bms_replident;
+}
+rf_context;
Add rf_context in the same line where } ends.
3. In the function header comment of rowfilter_walker, you mentioned
the simple expressions allowed but we should write why we are doing
so. It has been discussed in detail in various emails in this thread.
AFAIR, below are the reasons:
A. We don't want to allow user-defined functions or operators because
(a) if the user drops such a function/operator, or if there is any
other error via that function, the walsender won't be able to recover
from such an error even if we fix the function's problem, because it
uses a historic snapshot to access the row-filter; (b) any other table
could be accessed via a function, which won't work because of historic
snapshots in the logical decoding environment.
B. We don't allow anything other than immutable built-in functions, as
those can access the database and would lead to problem (b) mentioned
in the previous paragraph.
Don't we need to check for user-defined types, similar to user-defined
functions and operators? If not, why not?
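To illustrate rules A and B with concrete commands (the table and function names are hypothetical, and publish = 'insert' is used so the replica-identity rule does not interfere):

```sql
-- fail: user-defined function
CREATE FUNCTION rf_f(int) RETURNS int AS $$ SELECT $1 $$ LANGUAGE sql;
CREATE PUBLICATION p1 FOR TABLE t1 WHERE (a < rf_f(10)) WITH (publish = 'insert');

-- fail: built-in but volatile
CREATE PUBLICATION p2 FOR TABLE t1 WHERE (a < random()) WITH (publish = 'insert');

-- ok: built-in and IMMUTABLE
CREATE PUBLICATION p3 FOR TABLE t1 WHERE (lower(b) = 'abc') WITH (publish = 'insert');
```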
4.
+ * Rules: Node-type validation
+ * ---------------------------
+ * Allow only simple or compound expressions like:
+ * - "(Var Op Const)" or
It seems Var Op Var is allowed. I tried below and it works:
create publication pub for table t1 where (c1 < c2) WITH (publish = 'insert');
I think it should be okay to allow it provided we ensure that we never
access some other table/view etc. as part of the expression. Also, we
should document the behavior correctly.
--
With Regards,
Amit Kapila.
Thanks for all the review comments so far! We are endeavouring to keep
pace with them.
All feedback is being tracked and we will fix and/or reply to everything ASAP.
Meanwhile, PSA the latest set of v42* patches.
This version was mostly a patch restructuring exercise but it also
addresses some minor review comments in passing.
~~
Patches have been merged and rearranged based on Amit's suggestions
[Amit 23/11].
BEFORE:
v41-0001 Euler's main patch
v41-0002 Tab-complete
v41-0003 ExprState cache
v41-0004 OR/AND
v41-0005 Validation walker
v41-0006 new/old tuple updates
AFTER:
v42-0001 main patch <== v41-0001 + v41-0003 + v41-0004
v42-0002 validation walker <== v41-0005
v42-0003 new/old tuple updates <== v41-0006
v42-0004 tab-complete and pgdump <== v41-0002 (plus pgdump code from v41-0001)
~~
Some review comments were addressed as follows:
v42-0001 main patch
- improve comments about caching [Amit 15/11] #4.
- fix comment typo [Tang 23/11]
v42-0002 validation walker
- fix comment typo [Vignesh 24/11] #2
- add comment for rf_context [Vignesh 24/11] #4
- fix comment typo [Vignesh 24/11] #6
- code formatting [Amit 24/11] #2
v42-0003 new/old tuple
- fix compilation warning [Greg 18/11] #1
v42-0004 tab-complete and pgdump
- NA
------
[Amit 15/11] /messages/by-id/CAA4eK1L4ddTpc=-3bq==U8O-BJ=svkAFefRDpATKCG4hKYKAig@mail.gmail.com
[Amit 23/11] /messages/by-id/CAA4eK1+7R_=LFXHvfjjR88m3oTLYeLV=2zdAZEH3n7n8nhj==w@mail.gmail.com
[Tang 23/11] /messages/by-id/OS0PR01MB611389E3A5685B53930A4833FB609@OS0PR01MB6113.jpnprd01.prod.outlook.com
[Vignesh 24/11]
/messages/by-id/CALDaNm08Ynr_FzNg+doHj=_nBet+KZAvNbqmkEEw7M2SPpPEAw@mail.gmail.com
[Amit 24/11] /messages/by-id/CAA4eK1+Xd=kM5D3jtXyN+W7J+wU-yyQAdyq66a6Wcq_PKRTbSw@mail.gmail.com
[Greg 18/11] /messages/by-id/CAJcOf-fcDRsC4MYv2ZpUwFe68tPchbM-0fpb2z5ks=yLKDH2-g@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v42-0002-PS-Row-filter-validation-walker.patch (application/octet-stream)
From 59f16f07590d79751620a1effe4b8e0f1469168f Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 25 Nov 2021 10:44:15 +1100
Subject: [PATCH v42] PS - Row filter validation walker
This patch implements a parse-tree "walker" to validate a row-filter expression.
REPLICA IDENTITY validation
---------------------------
For publish mode "delete" and "update" it validates that any columns referenced
in the filter expression must be part of REPLICA IDENTITY or Primary Key.
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
Expression Node-kind validation
-------------------------------
Only simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no user-defined functions.
- no system functions (unless they are IMMUTABLE). See design decision at [1].
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr, NullIfExpr
[1] https://www.postgresql.org/message-id/CAA4eK1%2BXoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ%40mail.gmail.com
This patch also disables (#if 0) all other row-filter errors which were thrown for
EXPR_KIND_PUBLICATION_WHERE in the 0001 patch.
~~
Test code and PG docs are also updated.
Author: Peter Smith
---
src/backend/catalog/pg_publication.c | 176 +++++++++++++++++++++++++++++-
src/backend/parser/parse_agg.c | 5 +-
src/backend/parser/parse_expr.c | 6 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 2 +
src/test/regress/expected/publication.out | 134 ++++++++++++++++++++---
src/test/regress/sql/publication.sql | 98 ++++++++++++++++-
src/test/subscription/t/026_row_filter.pl | 7 +-
src/tools/pgindent/typedefs.list | 1 +
9 files changed, 401 insertions(+), 31 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 3ffec3a..ae781cd 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,9 +33,11 @@
#include "catalog/pg_publication_namespace.h"
#include "catalog/pg_publication_rel.h"
#include "catalog/pg_type.h"
+#include "catalog/pg_proc.h"
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parse_clause.h"
#include "parser/parse_collate.h"
#include "parser/parse_relation.h"
@@ -219,10 +221,175 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
+/* For rowfilter_walker. */
+typedef struct {
+ Relation rel;
+ bool check_replident; /* check if Var is bms_replident member? */
+ Bitmapset *bms_replident;
+} rf_context;
+
+/*
+ * The row filter walker checks that the row filter expression is legal.
+ *
+ * Rules: Node-type validation
+ * ---------------------------
+ * Allow only simple or compound expressions like:
+ * - "(Var Op Const)" or
+ * - "(Var Op Const) Bool (Var Op Const)"
+ *
+ * Specifically,
+ * - User-defined operators are not allowed.
+ * - User-defined functions are not allowed.
+ * - System functions that are not IMMUTABLE are not allowed.
+ * - NULLIF is allowed.
+ *
+ * Rules: Replica Identity validation
+ * -----------------------------------
+ * If the flag context.check_replident is true then validate that every variable
+ * referenced by the filter expression is a valid member of the allowed set of
+ * replica identity columns (context.bms_replident).
+ */
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+ bool too_complex = false;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ /* Optionally, do replica identity validation of the referenced column. */
+ if (context->check_replident)
+ {
+ Oid relid = RelationGetRelid(context->rel);
+ Var *var = (Var *) node;
+ AttrNumber attnum = var->varattno;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, context->bms_replident))
+ {
+ const char *colname = get_attname(relid, attnum, false);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ RelationGetRelationName(context->rel)),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+ }
+ }
+ else if (IsA(node, Const) || IsA(node, BoolExpr) || IsA(node, NullIfExpr))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ Oid funcid = ((FuncExpr *)node)->funcid;
+ char *funcname = get_func_name(funcid);
+
+ /*
+ * User-defined functions are not allowed.
+ * System-functions that are not IMMUTABLE are not allowed.
+ */
+ if (funcid >= FirstNormalObjectId)
+ {
+ forbidden = psprintf("user-defined functions are not allowed: %s",
+ funcname);
+ }
+ else
+ {
+ if (func_volatile(funcid) != PROVOLATILE_IMMUTABLE)
+ forbidden = psprintf("system functions that are not IMMUTABLE are not allowed: %s",
+ funcname);
+ }
+ }
+ else
+ {
+ elog(DEBUG1, "the row filter contained something unexpected: %s", nodeToString(node));
+ too_complex = true;
+ }
+
+ if (too_complex)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(context->rel)),
+ errhint("only simple expressions using columns, constants and immutable system functions are allowed")
+ ));
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(context->rel)),
+ errdetail("%s", forbidden)
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
+ * Check if the row-filter is valid according to the following rules:
+ *
+ * 1. Only certain simple node types are permitted in the expression. See
+ * function rowfilter_walker for details.
+ *
+ * 2. If the publish operation contains "delete" or "update" then only columns
+ * that are allowed by the REPLICA IDENTITY rules are permitted to be used in
+ * the row-filter WHERE clause.
*/
+static void
+rowfilter_expr_checker(Publication *pub, Relation rel, Node *rfnode)
+{
+ rf_context context = {0};
+
+ context.rel = rel;
+
+ /*
+ * For "delete" or "update", check that filter cols are also valid replica
+ * identity cols.
+ */
+ if (pub->pubactions.pubdelete || pub->pubactions.pubupdate)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ context.check_replident = true;
+
+ /*
+ * Find what are the cols that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+ }
+ }
+
+ /*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if anything not permitted or unexpected is encountered.
+ */
+ rowfilter_walker(rfnode, &context);
+
+ bms_free(context.bms_replident);
+}
+
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -315,10 +482,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
whereclause = transformWhereClause(pstate,
copyObject(pri->whereClause),
EXPR_KIND_PUBLICATION_WHERE,
- "PUBLICATION");
+ "PUBLICATION WHERE");
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, targetrel, whereclause);
}
/* Form a tuple. */
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..5e0c391 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,12 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
if (isAgg)
err = _("aggregate functions are not allowed in publication WHERE expressions");
else
err = _("grouping operations are not allowed in publication WHERE expressions");
-
+#endif
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +952,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("window functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..3519e62 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -201,6 +201,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
case T_FuncCall:
{
+#if 0
/*
* Forbid functions in publication WHERE condition
*/
@@ -209,6 +210,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("functions are not allowed in publication WHERE expressions"),
parser_errposition(pstate, exprLocation(expr))));
+#endif
result = transformFuncCall(pstate, (FuncCall *) expr);
break;
@@ -1777,7 +1779,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("cannot use subquery in publication WHERE expression");
+#endif
break;
/*
@@ -3100,7 +3104,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 29bebb7..212f473 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,10 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+ pstate->p_hasTargetSRFs = true;
+#if 0
err = _("set-returning functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..b3588df 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,12 +718,14 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+#if 0
/* Check it's not a custom operator for publication WHERE expressions */
if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("user-defined operators are not allowed in publication WHERE expressions"),
parser_errposition(pstate, location)));
+#endif
/* Do typecasting and build the expression tree */
if (ltree == NULL)
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 6959675..fdf7659 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -248,13 +248,15 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -264,7 +266,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -275,7 +277,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -286,7 +288,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
@@ -310,26 +312,26 @@ Publications:
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax1
Publication testpub_syntax1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE ((e < 999))
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax2
Publication testpub_syntax2
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
@@ -353,19 +355,31 @@ ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: user-defined functions are not allowed: testpub_rf_func99
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: system functions that are not IMMUTABLE are not allowed: random
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+HINT: only simple expressions using columns, constants and immutable system functions are allowed
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
@@ -387,6 +401,92 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 40198fc..c7160bd 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -143,7 +143,9 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -163,12 +165,12 @@ RESET client_min_messages;
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax1
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax2
DROP PUBLICATION testpub_syntax2;
@@ -182,13 +184,23 @@ SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
SET client_min_messages = 'ERROR';
@@ -208,6 +220,82 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
+
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/026_row_filter.pl b/src/test/subscription/t/026_row_filter.pl
index 64e71d0..de6b73d 100644
--- a/src/test/subscription/t/026_row_filter.pl
+++ b/src/test/subscription/t/026_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -280,9 +282,7 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -291,7 +291,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index da6ac8e..0d5682b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3497,6 +3497,7 @@ replace_rte_variables_context
ret_type
rewind_source
rewrite_event
+rf_context
rijndael_ctx
rm_detail_t
role_auth_extra
--
1.8.3.1
v42-0001-Row-filter-for-logical-replication.patch (application/octet-stream) Download
From 882de4c6f8dafe5ca43ecce9d2b74d2d5f70d826 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 24 Nov 2021 16:17:59 +1100
Subject: [PATCH v42] Row filter for logical replication.
This feature adds row filtering for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause must contain only columns that are covered by REPLICA
IDENTITY, or are part of the primary key (when REPLICA IDENTITY is not set),
otherwise DELETE or UPDATE operations will not be replicated. That's because the
old row is used and it only contains the primary key or columns that are part of
the REPLICA IDENTITY; the remaining columns are NULL. For INSERT operations any
column might be used in the WHERE clause. If the row filter evaluates to NULL,
it is treated as false. For simplicity, functions are not allowed; this could be
addressed in a future patch.
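As an illustrative sketch of that restriction (table and publication names here
are hypothetical, not from the patch), a filter referencing a column outside the
replica identity is rejected, while a filter on a key column is accepted:

```sql
-- Hypothetical example: "items" has PRIMARY KEY (id), so with the default
-- REPLICA IDENTITY only "id" is available in the old row.
CREATE TABLE items (id int PRIMARY KEY, note text);

-- ok: "id" is part of the replica identity (the PK)
CREATE PUBLICATION pub_items FOR TABLE items WHERE (id > 100);

-- fails: "note" is not part of the replica identity, so UPDATE/DELETE
-- could not evaluate the filter against the old row
CREATE PUBLICATION pub_items_bad FOR TABLE items WHERE (note <> 'hidden');
```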
If you choose to do the initial table synchronization, only data that satisfies
the row filters is pulled by the subscriber. If the subscription has several
publications in which a table has been published with different WHERE clauses,
rows which satisfy ANY of the expressions will be copied. If the subscriber is a
pre-15 version, data synchronization won't use row filters even if they are
defined on the publisher.
Previous versions cannot handle row filters.
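A minimal sketch of the initial-synchronization behavior (all object names and
the connection string are placeholders, not from the patch):

```sql
-- On the publisher:
CREATE TABLE t1 (a int PRIMARY KEY, b text);
INSERT INTO t1 VALUES (1, 'x'), (500, 'y'), (2000, 'z');
CREATE PUBLICATION pub_t1 FOR TABLE t1 WHERE (a > 1000);

-- On the subscriber (connection string is a placeholder):
-- CREATE SUBSCRIPTION sub_t1 CONNECTION 'dbname=src' PUBLICATION pub_t1;

-- After the initial table synchronization, t1 on the subscriber should
-- contain only the row (2000, 'z'), since only it satisfies (a > 1000).
```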
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines whether it uses the row filter of the
partition (if the parameter is false, the default) or of the root partitioned table.
Psql commands \dRp+ and \d+ will display any row filters.
Author: Euler Taveira, Peter Smith
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
Combining multiple row-filters
===============================
The subscription is treated "as a union of all the publications" [1], so the
row-filters are combined with OR.
If the subscription has several publications in which the same table has been
published with different filters, those expressions get OR'ed together so that
rows satisfying any of the expressions will be replicated.
Notice this means if one of the publications has no filter at all then all other
filters become redundant.
Author: Peter Smith
[1] https://www.postgresql.org/message-id/574b4e78-2f35-acf3-4bdc-4b872582e739%40enterprisedb.com
Row-filter caching
==================
The cached row-filters (e.g. ExprState *) are invalidated only in function
rel_sync_cache_relation_cb, so it means the ALTER PUBLICATION for one table
will not cause row-filters of other tables to also become invalidated.
The code related to caching row-filters is done just before they are needed
(in the pgoutput_row_filter function).
If there are multiple publication filters for a given table these are all
combined/flattened into a single filter.
Author: Peter Smith, Greg Nancarrow
The filter caching is based on suggestions from Amit [1] [2], and Houz [3]
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
[3] https://www.postgresql.org/message-id/OS0PR01MB5716090A70A73ADF58C58950948D9%40OS0PR01MB5716.jpnprd01.prod.outlook.com
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 13 +-
doc/src/sgml/ref/create_publication.sgml | 34 ++-
doc/src/sgml/ref/create_subscription.sgml | 17 ++
src/backend/catalog/pg_publication.c | 48 +++-
src/backend/commands/publicationcmds.c | 80 +++++--
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 38 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 116 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 348 ++++++++++++++++++++++++++-
src/bin/psql/describe.c | 27 ++-
src/include/catalog/pg_publication.h | 3 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 148 ++++++++++++
src/test/regress/sql/publication.sql | 75 ++++++
src/test/subscription/t/026_row_filter.pl | 357 ++++++++++++++++++++++++++++
23 files changed, 1312 insertions(+), 51 deletions(-)
mode change 100644 => 100755 src/backend/parser/gram.y
create mode 100644 src/test/subscription/t/026_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1d11be..af6b1f6 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6299,6 +6299,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index bb4ef5e..01247d7 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -30,7 +30,7 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -52,7 +52,9 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
remove one or more tables/schemas from the publication. Note that adding
tables/schemas to a publication that is already subscribed to will require a
<literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on the
- subscribing side in order to become effective.
+ subscribing side in order to become effective. Note also that a row-filter
+ <literal>WHERE</literal> clause is not allowed in the
+ <literal>DROP</literal> clause.
</para>
<para>
@@ -109,7 +111,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 4aeb0c8..851f48c 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -28,7 +28,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -76,6 +76,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -226,6 +230,22 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause must contain only columns that are
+ covered by <literal>REPLICA IDENTITY</literal>, or are part of the primary
+ key (when <literal>REPLICA IDENTITY</literal> is not set), otherwise
+ <command>DELETE</command> or <command>UPDATE</command> operations will not
+ be replicated. That's because the old row is used and it only contains the
+ primary key or columns that are part of the <literal>REPLICA IDENTITY</literal>;
+ the remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ operations any column might be used in the <literal>WHERE</literal> clause.
+ The new row is used and it contains all columns. A <literal>NULL</literal> value
+ causes the expression to evaluate to false; avoid using columns without
+ not-null constraints in the <literal>WHERE</literal> clause. The
+ <literal>WHERE</literal> clause does not allow functions or user-defined
+ operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -240,6 +260,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -253,6 +278,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..42bf8c2 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -208,6 +208,10 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
that are being subscribed to when the replication starts.
The default is <literal>true</literal>.
</para>
+ <para>
+ Row-filtering may also apply here and will affect what data is
+ copied. Refer to the Notes section below.
+ </para>
</listitem>
</varlistentry>
@@ -319,6 +323,19 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
the parameter <literal>create_slot = false</literal>. This is an
implementation restriction that might be lifted in a future release.
</para>
+
+ <para>
+ If any table in the publication has a <literal>WHERE</literal> clause, rows
+ that do not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be replicated. If the subscription has several publications in
+ which the same table has been published with different filters, those
+ expressions get OR'ed together so that rows satisfying any of the expressions
+ will be replicated. Notice this means if one of the publications has no filter
+ at all then all other filters become redundant. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row filtering
+ is ignored.
+ </para>
+
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 63579b2..3ffec3a 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -257,18 +260,22 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -289,10 +296,30 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a ParseState to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -306,6 +333,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -322,6 +355,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 7d4a0e9..c9ad079 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -529,40 +529,61 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *delrels = NIL;
ListCell *oldlc;
+ /*
+ * Check if the relation is member of the existing schema in the
+ * publication or member of the schema list specified.
+ */
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
-
- /* Calculate which relations to drop. */
+ /*
+ * Remove tables that are not found in the new table list and those
+ * tables which have a qual expression. The qual expression could be
+ * in the old table list or in the new table list.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
+ ListCell *newlc;
+ PublicationRelInfo *oldrel;
bool found = false;
+ HeapTuple rftuple;
+ bool rfisnull = true;
+
+ /* look up the cache for the old relmap */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(oldrelid),
+ ObjectIdGetDatum(pubid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual,
+ &rfisnull);
+ ReleaseSysCache(rftuple);
+ }
foreach(newlc, rels)
{
PublicationRelInfo *newpubrel;
newpubrel = (PublicationRelInfo *) lfirst(newlc);
- if (RelationGetRelid(newpubrel->relation) == oldrelid)
+
+ /*
+ * If the new relation or the old relation has a where clause,
+ * we need to remove it so that it can be added afresh later.
+ */
+ if (RelationGetRelid(newpubrel->relation) == oldrelid &&
+ newpubrel->whereClause == NULL && rfisnull)
{
found = true;
break;
}
}
- /* Not yet in the list, open it and add to the list */
+
if (!found)
{
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrelid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
}
@@ -899,6 +920,7 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ List *relids_with_rf = NIL;
/*
* Open, share-lock, and check all the explicitly-specified relations
@@ -926,15 +948,30 @@ OpenTableList(List *tables)
*/
if (list_member_oid(relids, myrelid))
{
+ char *relname = pstrdup(RelationGetRelationName(rel));
+
table_close(rel, ShareUpdateExclusiveLock);
+
+ /* Disallow duplicate tables if there are any with row-filters. */
+ if (t->whereClause || list_member_oid(relids_with_rf, myrelid))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("conflicting or redundant row-filters for \"%s\"",
+ relname)));
+
+ pfree(relname);
continue;
}
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
+ if (t->whereClause)
+ relids_with_rf = lappend_oid(relids_with_rf, myrelid);
+
/*
* Add children of this rel, if requested, so that they too are added
* to the publication. A partitioned table can't have any inheritance
@@ -967,6 +1004,8 @@ OpenTableList(List *tables)
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -974,6 +1013,7 @@ OpenTableList(List *tables)
}
list_free(relids);
+ list_free(relids_with_rf);
return rels;
}
@@ -993,6 +1033,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -1039,10 +1081,11 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
{
PublicationRelInfo *pub_rel = (PublicationRelInfo *) lfirst(lc);
Relation rel = pub_rel->relation;
+ Oid relid = RelationGetRelid(rel);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(relid, GetUserId()))
aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
RelationGetRelationName(rel));
@@ -1088,6 +1131,11 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
RelationGetRelationName(rel))));
}
+ if (pubrel->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE")));
+
ObjectAddressSet(obj, PublicationRelRelationId, prid);
performDeletion(&obj, DROP_CASCADE, 0);
}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 7d55fd6..4bcd77c 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4831,6 +4831,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index f537d3e..5776447 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2312,6 +2312,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
old mode 100644
new mode 100755
index a6d0cef..4d0616b
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9652,12 +9652,13 @@ CreatePublicationStmt:
* relation_expr here.
*/
PublicationObjSpec:
- TABLE relation_expr
+ TABLE relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_TABLE;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $2;
+ $$->pubtable->whereClause = $3;
}
| ALL TABLES IN_P SCHEMA ColId
{
@@ -9672,28 +9673,45 @@ PublicationObjSpec:
$$->pubobjtype = PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA;
$$->location = @5;
}
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ /*
+ * The OptWhereClause (row-filter) must be stored here
+ * but it is valid only for tables. If the ColId was
+ * mistakenly not a table this will be detected later
+ * in preprocess_pubobj_list() and an error is thrown.
+ */
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
$$->location = @1;
}
- | ColId indirection
+ | ColId indirection OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
+ $$->pubtable->whereClause = $3;
$$->location = @1;
}
/* grammar like tablename * , ONLY tablename, ONLY ( tablename ) */
- | extended_relation_expr
+ | extended_relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $1;
+ $$->pubtable->whereClause = $2;
}
| CURRENT_SCHEMA
{
@@ -17341,7 +17359,8 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ if (pubobj->name)
{
/* convert it to PublicationTable */
PublicationTable *pubtable = makeNode(PublicationTable);
@@ -17354,6 +17373,13 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
else if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_SCHEMA ||
pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA)
{
+ /* Row filters are not allowed on schema objects. */
+ if (pubobj->pubtable && pubobj->pubtable->whereClause)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid to use WHERE (row-filter) for a schema"),
+ parser_errposition(pubobj->location));
+
/*
+ * We can distinguish between the different types of schema
+ * objects based on whether name and pubtable are set.
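For context, the syntax forms the grammar changes above enable look like this (hypothetical table and schema names, following the same pattern as the regression tests in this patch):

```sql
-- Filter given at publication creation time.
CREATE PUBLICATION pub1 FOR TABLE t1, t2 WHERE (c <> 'test' AND d < 5);

-- Filter added or replaced later.
ALTER PUBLICATION pub1 ADD TABLE t3 WHERE (e > 1000 AND e < 2000);
ALTER PUBLICATION pub1 SET TABLE t3 WHERE (e > 300 AND e < 500);

-- Rejected by preprocess_pubobj_list(): a row filter cannot be
-- attached to a schema object, only to a table.
```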
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 542f916..29bebb7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
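Taken together, the parse_agg.c, parse_expr.c, parse_func.c, and parse_oper.c changes above restrict what may appear in a publication WHERE clause. As a toy model (not the patch's C code; the dict-based node shape is purely illustrative), the rejected node kinds can be sketched like this:

```python
# Node kinds the patch allows in a publication WHERE clause (toy model);
# everything else is rejected with a message mirroring the patch's errors.
ALLOWED_KINDS = {"columnref", "const", "builtin_operator", "boolexpr"}

REJECT_MESSAGES = {
    "funccall": "functions are not allowed in publication WHERE expressions",
    "aggregate": "aggregate functions are not allowed in publication WHERE expressions",
    "windowfunc": "window functions are not allowed in publication WHERE expressions",
    "sublink": "cannot use subquery in publication WHERE expression",
    "user_operator": "user-defined operators are not allowed in publication WHERE expressions",
}

def check_publication_where(node):
    """Recursively walk a toy expression tree (dicts with 'kind' and
    optional 'args') and raise on forbidden node kinds."""
    kind = node["kind"]
    if kind not in ALLOWED_KINDS:
        raise ValueError(REJECT_MESSAGES.get(
            kind, "%s is not allowed in publication WHERE expressions" % kind))
    for child in node.get("args", []):
        check_publication_where(child)
```

A builtin comparison over a column and a constant passes the walk; a function call anywhere in the tree raises.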
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..af73b14 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,80 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get the relation qual. DISTINCT prevents the same expression for a
+ * table in multiple publications from being included multiple times in
+ * the final expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ /*
+ * Multiple row-filter expressions for the same table will later be
+ * combined with OR in the COPY query, which means that if any of the
+ * filters is null, none of the other filters has any effect. So this
+ * loop also checks for null filters and exits early if one is
+ * encountered.
+ */
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+
+ if (isnull)
+ {
+ /*
+ * A single null filter nullifies the effect of any other filter for this
+ * table.
+ */
+ if (*qual)
+ {
+ list_free(*qual);
+ *qual = NIL;
+ }
+ break;
+ }
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +887,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +896,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +907,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with OR'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +927,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of OR'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
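For reviewers, the qual handling in fetch_remote_table_info()/copy_table() above can be summarized outside of C. This is a simplified sketch (not the patch itself; the real code also uses the SELECT form for non-table relkinds and skips generated columns) of how multiple publication filters, and a missing (null) filter, affect the generated COPY query:

```python
def build_copy_query(nspname, relname, cols, filters):
    """Sketch of the COPY query construction in copy_table().

    'filters' holds the deparsed row-filter expressions fetched from
    pg_publication_rel.prqual; None represents a publication that has
    no filter for this table.
    """
    # A single missing filter nullifies all the others: the filters are
    # OR'ed together, and "no filter" means "publish every row".
    if any(f is None for f in filters):
        filters = []
    target = '"%s"."%s"' % (nspname, relname)
    if not filters:
        return "COPY %s TO STDOUT" % target
    # With filters present, fall back to COPY (SELECT ...) with OR'ed quals.
    select = "SELECT %s FROM %s WHERE %s" % (
        ", ".join(cols), target, " OR ".join(filters))
    return "COPY (%s) TO STDOUT" % select
```

For example, two filters produce one OR'ed WHERE clause, while a table that is also published unfiltered falls back to a plain COPY.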
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6f6a203..a2a0ce9 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,28 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/makefuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -116,6 +126,17 @@ typedef struct RelationSyncEntry
PublicationActions pubactions;
/*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' indicates whether the exprstate has been
+ * assigned yet. We cannot rely on the exprstate value alone for this
+ * purpose because the relation might have no filter at all, in which
+ * case exprstate is also NULL.
+ */
+ bool rowfilter_valid;
+ ExprState *exprstate; /* ExprState for row filter(s) */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+
+ /*
* OID of the relation to publish changes as. For a partition, this may
* be set to one of its ancestors whose schema will be used when
* replicating changes, if publish_via_partition_root is set for the
@@ -137,7 +158,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +167,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +649,265 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+
+ /* Cache ExprState using CacheMemoryContext. */
+ Assert(CurrentMemoryContext == CacheMemoryContext);
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be cast to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * This is the same code as ExecPrepareExpr() but that is not used because
+ * we have no EState to pass it. There should probably be another function
+ * in the executor to handle the execution outside a normal Plan tree
+ * context.
+ */
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false i.e. the change
+ * isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ List *rfnodes = NIL;
+ int n_filters;
+
+ /*
+ * If the row filter cache is currently flagged "invalid", we do not yet
+ * know whether there are any row filters for this relation, so build the
+ * cache now.
+ *
+ * This code typically executes only once per relation.
+ *
+ * NOTE: The ExprState cache could have been created up-front in
+ * get_rel_sync_entry() instead of being assigned on-the-fly here. It is
+ * done here because get_rel_sync_entry() is sometimes called in
+ * scenarios where no row will be published at all. For example, truncate
+ * may not need any row evaluation, so there is no need to compute the
+ * filter there; it would also be wasted work if an error happened before
+ * the filter was ever evaluated. Other future operations that use
+ * get_rel_sync_entry() may likewise not need an ExprState. Moreover,
+ * because the decision to publish or not is made AFTER the call to
+ * get_rel_sync_entry(), the filter evaluation may turn out to be
+ * unnecessary altogether. Hence the decision to defer this work to the
+ * last moment, when we know it will be needed.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ MemoryContext oldctx;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains. Release the tuple table slot if it
+ * already exists.
+ */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
+ /*
+ * Determine whether there are any row filters for this relation. If so,
+ * then prepare the necessary ExprState and cache it in entry->exprstate.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ *
+ * NOTE: Multiple row-filters for the same table are combined by OR-ing
+ * them together, but this means that if (in any of the publications)
+ * there is *no* filter then effectively none of the other filters have
+ * any meaning either.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+
+ /*
+ * Look up the row filter, if any, and remember it in a list; all
+ * filters found are combined after this 'publications' loop.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(entry->publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes = lappend(rfnodes, rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+
+ if (rfisnull)
+ {
+ /*
+ * If this publication has no row filter, any other row filters
+ * for this table have no effect either (the filters are OR'ed
+ * together), so discard anything collected so far and exit the
+ * publications loop early.
+ */
+ if (rfnodes)
+ {
+ list_free_deep(rfnodes);
+ rfnodes = NIL;
+ }
+ break;
+ }
+ }
+
+ } /* loop all subscribed publications */
+
+ /*
+ * Combine all the row filters (if any) into one filter and build its ExprState.
+ */
+ n_filters = list_length(rfnodes);
+ if (n_filters > 0)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes, -1) : linitial(rfnodes);
+ entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ entry->rowfilter_valid = true;
+ }
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ if (message_level_is_interesting(DEBUG3))
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * NOTE: Multiple publication row-filters have already been combined into a
+ * single exprstate.
+ */
+ if (entry->exprstate)
+ {
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(entry->exprstate, ecxt);
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +934,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +958,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +965,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * Schema should be sent before the logic that replaces the
+ * relation because it also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +998,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +1032,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +1101,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1116,9 +1423,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1139,8 +1447,11 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1245,9 +1556,6 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
}
list_free(pubids);
@@ -1354,6 +1662,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanups. (Will be rebuilt later if needed).
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstate != NULL)
+ {
+ pfree(entry->exprstate);
+ entry->exprstate = NULL;
+ }
}
}
@@ -1365,6 +1688,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1698,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1392,6 +1718,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
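The decision logic in pgoutput_row_filter() and pgoutput_row_filter_exec_expr() above can be stated compactly. This is an illustrative sketch only (in the patch the filters are OR'ed into a single ExprState up front; here each filter's result is combined explicitly, with None standing in for SQL NULL):

```python
def row_filter_passes(filter_results):
    """Sketch of the replicate/skip decision in pgoutput_row_filter().

    'filter_results' holds one entry per publication row filter evaluated
    against the tuple: True, False, or None (SQL NULL).
    """
    # No filters at all: every change is replicated.
    if not filter_results:
        return True
    # A NULL result is treated as false, i.e. that filter does not
    # replicate the row; the OR'ed combination passes if any filter
    # evaluates to true.
    return any(r is True for r in filter_results)
```

This matches the patch's semantics: no filter means publish everything, NULL counts as false, and a row is sent if any filter accepts it.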
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index ea721d9..8be5643 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3150,17 +3150,22 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
" JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
"WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
"UNION\n"
"SELECT pubname\n"
+ " , pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+ " , pg_catalog.pg_class c\n"
"WHERE pr.prrelid = '%s'\n"
+ " AND c.oid = pr.prrelid\n"
"UNION\n"
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
"WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
"ORDER BY 1;",
@@ -3196,6 +3201,13 @@ describeOneTableDetails(const char *schemaname,
printfPQExpBuffer(&buf, " \"%s\"",
PQgetvalue(result, i, 0));
+ if (pset.sversion >= 150000)
+ {
+ /* Also display the publication row-filter (if any) for this table */
+ if (!PQgetisnull(result, i, 1))
+ appendPQExpBuffer(&buf, " WHERE (%s)", PQgetvalue(result, i, 1));
+ }
+
printTableAddFooter(&cont, buf.data);
}
PQclear(result);
@@ -6320,8 +6332,12 @@ addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
for (i = 0; i < count; i++)
{
if (!singlecol)
+ {
printfPQExpBuffer(buf, " \"%s.%s\"", PQgetvalue(res, i, 0),
PQgetvalue(res, i, 1));
+ if (!PQgetisnull(res, i, 2))
+ appendPQExpBuffer(buf, " WHERE (%s)", PQgetvalue(res, i, 2));
+ }
else
printfPQExpBuffer(buf, " \"%s\"", PQgetvalue(res, i, 0));
@@ -6450,8 +6466,15 @@ describePublications(const char *pattern)
{
/* Get the tables for the specified publication */
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 1ae439e..bd0d4ce 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -86,6 +86,7 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -122,7 +123,7 @@ extern List *GetPubPartitionOptionRelations(List *result,
Oid relid);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 067138e..5d58a9f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3641,6 +3641,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 1feb558..6959675 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -239,6 +239,154 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+ Table "public.testpub_rf_tbl1"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+----------+--------------+-------------
+ a | integer | | | | plain | |
+ b | text | | | | extended | |
+Publications:
+ "testpub5a" WHERE ((a > 1))
+ "testpub5b"
+ "testpub5c" WHERE ((a > 3))
+
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+ Publication testpub_syntax1
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE ((e < 999))
+
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+ Publication testpub_syntax2
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
+
+DROP PUBLICATION testpub_syntax2;
+-- fail - a WHERE row-filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a =...
+ ^
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+ERROR: invalid to use WHERE (row-filter) for a schema
+LINE 1: ...ax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf...
+ ^
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if the table has a WHERE row-filter
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+ERROR: cannot add relation "testpub_rf_myschema1.testpub_rf_tb16" to publication
+DETAIL: Table's schema "testpub_rf_myschema1" is already part of the publication or part of the specified schema list.
+RESET client_min_messages;
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 8fa0435..40198fc 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -134,6 +134,81 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;
+-- fail - a WHERE row-filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if the table has a WHERE row-filter
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+RESET client_min_messages;
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/026_row_filter.pl b/src/test/subscription/t/026_row_filter.pl
new file mode 100644
index 0000000..64e71d0
--- /dev/null
+++ b/src/test/subscription/t/026_row_filter.pl
@@ -0,0 +1,357 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 10;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4a FOR TABLE tab_rowfilter_4 WHERE (c % 2 = 0)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4b FOR TABLE tab_rowfilter_4"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) SELECT generate_series(1, 10)");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4a, tap_pub_4b"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if
+# c is divisible by 2 OR 3 (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(13|2|20), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# The expressions are OR'ed together, but a publication with no filter
+# matches every row - i.e. the same as having no filter at all.
+# Expect all rows: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(10|1|10), 'check initial data copy from table tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) VALUES (21), (22), (23), (24), (25)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) VALUES (0), (11), (12)");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if
+# c is divisible by 2 OR 3.
+#
+# Expect original rows (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+# Plus (21, 22, 24)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(16|2|24), 'check replicated rows to tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# The expressions are OR'ed together, but a publication with no filter
+# matches every row - i.e. the same as having no filter at all.
+# Expect all rows from initial copy: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+# And also (0, 11, 12)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
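As the TAP test comments above note, when the same table is published by several publications the row-filter expressions are OR'ed together, and a publication without a filter matches every row. A minimal C sketch of that combination rule (the helper names here are invented for illustration, not code from the patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A NULL entry stands for a publication without a row filter. */
typedef bool (*RowFilter)(int c);

static bool
row_is_published(RowFilter *filters, size_t nfilters, int c)
{
    for (size_t i = 0; i < nfilters; i++)
    {
        /*
         * A publication with no filter publishes every row, so the OR of
         * the expressions is trivially true.
         */
        if (filters[i] == NULL || filters[i](c))
            return true;
    }
    return false;
}

/* The two filters used by tap_pub_1 and tap_pub_2 for tab_rowfilter_2. */
static bool mod2(int c) { return c % 2 == 0; }
static bool mod3(int c) { return c % 3 == 0; }
```

With `{mod2, mod3}`, rows 1..20 yield the 13 values divisible by 2 or 3 that the test expects; with `{mod2, NULL}` (as for tab_rowfilter_4) every row is published.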
Attachment: v42-0004-Tab-auto-complete-and-pgdump-support-for-Row-Fil.patch (application/octet-stream)
From ff5ae14d2d9b8358ab27d2ea5430b9b5329cab50 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 25 Nov 2021 12:40:30 +1100
Subject: [PATCH v42] Tab auto-complete and pgdump support for Row Filter.
tab-auto-complete
-----------------
e.g.
"CREATE PUBLICATION <name> FOR TABLE <name>" - complete with "WHERE (".
"ALTER PUBLICATION <name> ADD|SET TABLE <name>" - complete with "WHERE (".
Author: Peter Smith
pg_dump
-------
Author: Euler Taveira
---
src/bin/pg_dump/pg_dump.c | 24 ++++++++++++++++++++----
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/tab-complete.c | 13 ++++++++++++-
3 files changed, 33 insertions(+), 5 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 5a2094d..3696ad2 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4264,6 +4264,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4274,9 +4275,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4285,6 +4293,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4325,6 +4334,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationObject(&(pubrinfo[j].dobj), fout);
@@ -4395,8 +4408,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index d1d8608..0842a3c 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -630,6 +630,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index fa2e195..132b61f 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,14 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (");
+ /* "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with table attributes */
+ /* "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2768,10 +2776,13 @@ psql_completion(const char *text, int start, int end)
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
+ /* "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
/*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
--
1.8.3.1
v42-0003-Support-updates-based-on-old-and-new-tuple-in-ro.patch (application/octet-stream)
From d77fa3d2a25c3e5f14a4250af9d35e8fa3ec793d Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 25 Nov 2021 12:05:52 +1100
Subject: [PATCH v42] Support updates based on old and new tuple in row
filters.
When applying row filter on updates, check both old_tuple and new_tuple
to decide how an update needs to be transformed.
UPDATE
old-row (match) new-row (no match) -> DELETE
old-row (no match) new row (match) -> INSERT
old-row (match) new row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Also tuples that have been deformed will be cached in slots to avoid
multiple deforming of tuples.
Author: Ajin Cherian
---
src/backend/replication/logical/proto.c | 35 ++--
src/backend/replication/pgoutput/pgoutput.c | 244 ++++++++++++++++++++++++++--
src/include/replication/logicalproto.h | 7 +-
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/026_row_filter.pl | 4 +-
src/tools/pgindent/typedefs.list | 1 +
6 files changed, 262 insertions(+), 35 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..6b55a94 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -31,8 +32,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
- HeapTuple tuple, bool binary);
-
+ HeapTuple tuple, TupleTableSlot *slot,
+ bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -410,7 +411,7 @@ logicalrep_write_insert(StringInfo out, TransactionId xid, Relation rel,
pq_sendint32(out, RelationGetRelid(rel));
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, NULL, binary);
}
/*
@@ -442,7 +443,8 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
*/
void
logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
- HeapTuple oldtuple, HeapTuple newtuple, bool binary)
+ HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary)
{
pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
@@ -463,11 +465,11 @@ logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
pq_sendbyte(out, 'O'); /* old tuple follows */
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, oldslot, binary);
}
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, newslot, binary);
}
/*
@@ -536,7 +538,7 @@ logicalrep_write_delete(StringInfo out, TransactionId xid, Relation rel,
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, NULL, binary);
}
/*
@@ -749,11 +751,12 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
-logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binary)
+logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, TupleTableSlot *slot,
+bool binary)
{
TupleDesc desc;
- Datum values[MaxTupleAttributeNumber];
- bool isnull[MaxTupleAttributeNumber];
+ Datum *values;
+ bool *isnull;
int i;
uint16 nliveatts = 0;
@@ -771,7 +774,17 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binar
enlargeStringInfo(out, tuple->t_len +
nliveatts * (1 + 4));
- heap_deform_tuple(tuple, desc, values, isnull);
+ if (slot == NULL || TTS_EMPTY(slot))
+ {
+ values = (Datum *) palloc(desc->natts * sizeof(Datum));
+ isnull = (bool *) palloc(desc->natts * sizeof(bool));
+ heap_deform_tuple(tuple, desc, values, isnull);
+ }
+ else
+ {
+ values = slot->tts_values;
+ isnull = slot->tts_isnull;
+ }
/* Write the values */
for (i = 0; i < desc->natts; i++)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index a2a0ce9..2c14e31 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -134,7 +134,10 @@ typedef struct RelationSyncEntry
*/
bool rowfilter_valid;
ExprState *exprstate; /* ExprState for row filter(s) */
- TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
+ TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
+ TupleTableSlot *tmp_new_tuple; /* slot for temporary new tuple used for expression evaluation */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -169,10 +172,16 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot,
+ RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry,
+ ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -736,18 +745,112 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * Update is checked against the row filter, if any.
+ * Updates are transformed to inserts and deletes based on the
+ * old_tuple and new_tuple. The new action is updated in the
+ * action parameter. If not updated, action remains as update.
+ * old-row (no match) new-row (no match) -> (drop change)
+ * old-row (no match) new row (match) -> INSERT
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (match) new row (match) -> UPDATE
+ * If the change is to be replicated returns true, else false.
*/
static bool
-pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
+{
+ TupleDesc desc = entry->scantuple->tts_tupleDescriptor;
+ int i;
+ bool old_matched, new_matched;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ /* update requires a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /*
+ * If no old_tuple, then none of the replica identity columns changed
+ * and this would reduce to a simple update.
+ */
+ if (!oldtuple)
+ {
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ return pgoutput_row_filter(relation, NULL, newtuple, entry);
+ }
+
+ old_slot = entry->old_tuple;
+ new_slot = entry->new_tuple;
+ tmp_new_slot = entry->tmp_new_tuple;
+ ExecClearTuple(old_slot);
+ ExecClearTuple(new_slot);
+ ExecClearTuple(tmp_new_slot);
+
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, new_slot);
+
+ /*
+ * For updates, both the newtuple and oldtuple needs to be checked
+ * against the row-filter. The newtuple might not have all the
+ * replica identity columns, in which case it needs to be
+ * copied over from the oldtuple.
+ */
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ /* if the column in the new_tuple is null, nothing to do */
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ /*
+ * Unchanged toasted replica identity columns are
+ * only detoasted in the old tuple, copy this over to the newtuple.
+ */
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ Assert(!old_slot->tts_isnull[i] &&
+ !VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]));
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ }
+
+ }
+
+ old_matched = pgoutput_row_filter_virtual(relation, old_slot, entry);
+ new_matched = pgoutput_row_filter_virtual(relation, tmp_new_slot, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (!old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && old_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter, the first time.
+ */
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
{
- EState *estate;
- ExprContext *ecxt;
ListCell *lc;
- bool result = true;
- Oid relid = RelationGetRelid(relation);
List *rfnodes = NIL;
int n_filters;
@@ -777,7 +880,7 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
TupleDesc tupdesc = RelationGetDescr(relation);
/*
- * Create a tuple table slot for row filter. TupleDesc must live as
+ * Create tuple table slots for row filter. TupleDesc must live as
* long as the cache remains. Release the tuple table slot if it
* already exists.
*/
@@ -786,9 +889,28 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
+ if (entry->old_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->old_tuple);
+ entry->old_tuple = NULL;
+ }
+ if (entry->new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->new_tuple);
+ entry->new_tuple = NULL;
+ }
+ if (entry->tmp_new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->tmp_new_tuple);
+ entry->tmp_new_tuple = NULL;
+ }
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+
MemoryContextSwitchTo(oldctx);
/*
@@ -869,6 +991,67 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
entry->rowfilter_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * The row is passed in as a virtual slot.
+ *
+ */
+static bool
+pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ if (message_level_is_interesting(DEBUG3))
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(entry->relid)),
+ get_rel_name(entry->relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = slot;
+
+ /*
+ * NOTE: Multiple publication row-filters have already been combined to a
+ * single exprstate.
+ */
+ if (entry->exprstate)
+ {
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(entry->exprstate, ecxt);
+ }
+
+ /* Cleanup allocated resources */
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
/* Bail out if there is no row filter */
if (!entry->exprstate)
@@ -900,7 +1083,6 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
}
/* Cleanup allocated resources */
- ResetExprContext(ecxt);
FreeExecutorState(estate);
PopActiveSnapshot();
@@ -958,6 +1140,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -966,7 +1151,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
break;
/*
@@ -997,9 +1182,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
- /* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter_update_check(relation, oldtuple, newtuple, relentry,
+ &modified_action))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1022,8 +1208,27 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ logicalrep_write_update(ctx->out, xid, relation,
+ oldtuple, newtuple, relentry->old_tuple,
+ relentry->new_tuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1033,7 +1238,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1451,6 +1656,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
+ entry->new_tuple = NULL;
+ entry->old_tuple = NULL;
+ entry->tmp_new_tuple = NULL;
entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..427c40a 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -211,7 +212,11 @@ extern void logicalrep_write_insert(StringInfo out, TransactionId xid,
extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup);
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
- HeapTuple newtuple, bool binary);
+ HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/026_row_filter.pl b/src/test/subscription/t/026_row_filter.pl
index de6b73d..a2f25f6 100644
--- a/src/test/subscription/t/026_row_filter.pl
+++ b/src/test/subscription/t/026_row_filter.pl
@@ -277,7 +277,8 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+# but row deleted after the update below.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -289,7 +290,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 0d5682b..4496fe4 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2194,6 +2194,7 @@ ReorderBufferApplyChangeCB
ReorderBufferApplyTruncateCB
ReorderBufferBeginCB
ReorderBufferChange
+ReorderBufferChangeType
ReorderBufferCommitCB
ReorderBufferCommitPreparedCB
ReorderBufferDiskChange
--
1.8.3.1
On Thu, Nov 18, 2021 at 9:35 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Thu, Nov 18, 2021 at 12:33 PM Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v40* patches.
Thanks for the patch updates.
A couple of comments so far:
(1) compilation warning
With the patches applied, there's a single compilation warning when
Postgres is built:
pgoutput.c: In function ‘pgoutput_row_filter_init’:
pgoutput.c:854:8: warning: unused variable ‘relid’ [-Wunused-variable]
Oid relid = RelationGetRelid(relation);
^~~~~
Fixed in v42* [1]
------
[1]: /messages/by-id/CAHut+PsGZHvafa3K_RAJ0Agm28W2owjNN+qU0EUsSjBNbuXFsQ@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Nov 23, 2021 at 8:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Nov 18, 2021 at 11:02 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Mon, Nov 15, 2021 at 9:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
4. I think we should add some comments in pgoutput_row_filter() as to
why we are caching the row_filter here instead of
get_rel_sync_entry()? That has been discussed multiple times so it is
better to capture that in comments.
Added comment in v40 [1]
I think apart from truncate and error cases, it can also happen for
other operations because we decide whether to publish a change
(operation) after calling get_rel_sync_entry() in pgoutput_change. I
think we can reflect that as well in the comment.
Fixed in v42* [1]
------
[1]: /messages/by-id/CAHut+PsGZHvafa3K_RAJ0Agm28W2owjNN+qU0EUsSjBNbuXFsQ@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Nov 23, 2021 at 4:40 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Thursday, November 18, 2021 9:34 AM Peter Smith <smithpb2250@gmail.com> wrote:
PSA new set of v40* patches.
Besides, a small comment on 0004 patch:
+ * Multiple row-filter expressions for the same publication will later be
+ * combined by the COPY using OR, but this means if any of the filters is
Should we change it to:
Multiple row-filter expressions for the same table ...
Fixed in v42* [1]
------
[1]: /messages/by-id/CAHut+PsGZHvafa3K_RAJ0Agm28W2owjNN+qU0EUsSjBNbuXFsQ@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Wed, Nov 24, 2021 at 8:52 PM vignesh C <vignesh21@gmail.com> wrote:
On Tue, Nov 23, 2021 at 4:58 PM Ajin Cherian <itsajin@gmail.com> wrote:
Attaching a new patchset v41 which includes changes by both Peter and myself.
Patches v40-0005 and v40-0006 have been merged to create patch
v41-0005 which reduces the patches to 6 again.
Few comments:
...
2) Should '"delete" or "delete"' be '"delete" or "update"'
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -340,7 +340,7 @@ rowfilter_walker(Node *node, rf_context *context)
 * 1. Only certain simple node types are permitted in the expression. See
 * function rowfilter_walker for details.
 *
- * 2. If the publish operation contains "delete" then only columns that
+ * 2. If the publish operation contains "delete" or "delete" then only columns that
 * are allowed by the REPLICA IDENTITY rules are permitted to be used in the
 * row-filter WHERE clause.
 */
@@ -352,12 +352,10 @@ rowfilter_expr_checker(Publication *pub, Relation rel, Node *rfnode)
 context.rel = rel;
/*
- * For "delete", check that filter cols are also valid replica identity
+ * For "delete" or "update", check that filter cols are also valid replica identity
 * cols.
Fixed in v42* [1]
4) This should be included in typedefs.list, also we could add some comments for this structure
+typedef struct {
+ Relation rel;
+ Bitmapset *bms_replident;
+}
+rf_context;
Fixed in v42* [1]
6) typo "filte" should be "filter":
+/*
+ * The row filte walker checks that the row filter expression is legal.
+ *
+ * Rules: Node-type validation
+ * ---------------------------
+ * Allow only simple or compound expressions like:
+ * - "(Var Op Const)" or
+ * - "(Var Op Const) Bool (Var Op Const)"
Fixed in v42* [1]
------
[1]: /messages/by-id/CAHut+PsGZHvafa3K_RAJ0Agm28W2owjNN+qU0EUsSjBNbuXFsQ@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Thu, Nov 25, 2021 at 12:03 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Nov 23, 2021 at 4:58 PM Ajin Cherian <itsajin@gmail.com> wrote:
Attaching a new patchset v41 which includes changes by both Peter and myself.
...
Few more comments:
=================
...
0005
2.
+typedef struct {
+ Relation rel;
+ bool check_replident;
+ Bitmapset *bms_replident;
+}
+rf_context;
Add rf_context in the same line where } ends.
Fixed in v42* [1]
------
[1]: /messages/by-id/CAHut+PsGZHvafa3K_RAJ0Agm28W2owjNN+qU0EUsSjBNbuXFsQ@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Nov 23, 2021 at 10:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Nov 23, 2021 at 4:58 PM Ajin Cherian <itsajin@gmail.com> wrote:
Attaching a new patchset v41 which includes changes by both Peter and myself.
...
I suggest at this stage we can combine 0001, 0003, and 0004. Then move
pg_dump and psql (describe.c) related changes to 0002 and make 0002 as
the last patch in the series. This will help review backend changes
first and then we can look at client-side changes.
The patch combining and reordering was done as suggested.
BEFORE:
v41-0001 Euler's main patch
v41-0002 Tab-complete
v41-0003 ExprState cache
v41-0004 OR/AND
v41-0005 Validation walker
v41-0006 new/old tuple updates
AFTER:
v42-0001 main patch <== v41-0001 + v41-0003 + v41-0004
v42-0002 validation walker <== v41-0005
v42-0003 new/old tuple updates <== v41-0006
v42-0004 tab-complete and pgdump <== v41-0002 (plus pgdump code from v41-0001)
~
Please note, I did not remove the describe.c changes from the
v42-0001 patch at this time. I left this as-is because I felt the
ability for psql \d+ or \dRp+ etc to display the current row-filter is
*essential* functionality to be able to test and debug the 0001 patch
properly.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Wed, Nov 24, 2021 1:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Nov 24, 2021 at 6:51 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Tues, Nov 23, 2021 6:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Nov 23, 2021 at 1:29 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Tues, Nov 23, 2021 2:27 PM vignesh C <vignesh21@gmail.com> wrote:
On Thu, Nov 18, 2021 at 7:04 AM Peter Smith
<smithpb2250@gmail.com>
wrote:
PSA new set of v40* patches.
Few comments:
1) When a table is added to the publication, replica identity is
checked. But while modifying the publish action to include
delete/update, replica identity is not checked for the existing
tables. I felt it should be checked for the existing tables too.
In addition to this, I think we might also need some check to
prevent user from changing the REPLICA IDENTITY index which is used in
the filter expression.
I was thinking is it possible to do the check related to REPLICA
IDENTITY in function CheckCmdReplicaIdentity() or In
GetRelationPublicationActions(). If we move the REPLICA IDENTITY
check to this function, it would be consistent with the existing
behavior about the check related to REPLICA IDENTITY(see the
comments in CheckCmdReplicaIdentity) and seems can cover all the cases
mentioned above.
Yeah, adding the replica identity check in CheckCmdReplicaIdentity()
would cover all the above cases but I think that would put a premium
on each update/delete operation. I think traversing the expression
tree (it could be multiple traversals if the relation is part of
multiple publications) during each update/delete would be costly.
Don't you think so?
Yes, I agreed that traversing the expression every time would be costly.
I thought maybe we can cache the columns used in the row filter, or cache
only a flag (can_update|delete) in the relcache. I think every operation that
affects the row filter or replica identity will invalidate the relcache,
and the cost of the check seems acceptable with the cache.
I think if we can cache this information, especially as a bool flag, then that should
probably be better.
When researching and writing a top-up patch about this.
I found a possible issue which I'd like to confirm first.
It's possible the table is published in two publications A and B, publication A
only publishes "insert" and publication B publishes "update". When an UPDATE occurs, the row
filters in both A and B will be executed. Is this behavior expected?
For example:
---- Publication
create table tbl1 (a int primary key, b int);
create publication A for table tbl1 where (b<2) with(publish='insert');
create publication B for table tbl1 where (a>1) with(publish='update');
---- Subscription
create table tbl1 (a int primary key);
CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres host=localhost
port=10000' PUBLICATION A,B;
---- Publication
update tbl1 set a = 2;
The publication can be created, and on UPDATE the row filter in A (b<2) will
also be executed, but the column in it is not part of the replica identity.
(I am not against this behavior, just confirming.)
Best regards,
Hou zj
On Thu, Nov 25, 2021, at 10:39 AM, houzj.fnst@fujitsu.com wrote:
When researching and writing a top-up patch about this.
I found a possible issue which I'd like to confirm first.
It's possible the table is published in two publications A and B, publication A
only publishes "insert" and publication B publishes "update". When an UPDATE occurs, the row
filters in both A and B will be executed. Is this behavior expected?
Good question. No. The code should check the action before combining the
multiple row filters.
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Wednesday, November 24, 2021 1:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Nov 24, 2021 at 6:51 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
On Tues, Nov 23, 2021 6:16 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:
On Tue, Nov 23, 2021 at 1:29 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Tues, Nov 23, 2021 2:27 PM vignesh C <vignesh21@gmail.com>
wrote:
On Thu, Nov 18, 2021 at 7:04 AM Peter Smith
<smithpb2250@gmail.com>
wrote:
PSA new set of v40* patches.
Few comments:
1) When a table is added to the publication, replica identity is
checked. But while modifying the publish action to include
delete/update, replica identity is not checked for the existing
tables. I felt it should be checked for the existing tables too.
In addition to this, I think we might also need some check to
prevent user from changing the REPLICA IDENTITY index which is used in
the filter expression.
I was thinking is it possible to do the check related to REPLICA
IDENTITY in function CheckCmdReplicaIdentity() or In
GetRelationPublicationActions(). If we move the REPLICA IDENTITY
check to this function, it would be consistent with the existing
behavior about the check related to REPLICA IDENTITY(see the
comments in CheckCmdReplicaIdentity) and seems can cover all the
cases mentioned above.
Yeah, adding the replica identity check in CheckCmdReplicaIdentity()
would cover all the above cases but I think that would put a premium
on each update/delete operation. I think traversing the expression
tree (it could be multiple traversals if the relation is part of
multiple publications) during each update/delete would be costly.
Don't you think so?
Yes, I agreed that traversing the expression every time would be costly.
I thought maybe we can cache the columns used in the row filter, or cache
only a flag (can_update|delete) in the relcache. I think every operation that
affects the row filter or replica identity will invalidate the relcache,
and the cost of the check seems acceptable with the cache.
I think if we can cache this information, especially as a bool flag, then that should
probably be better.
Based on this direction, I tried to write a top-up POC patch (0005) which I'd like to share.
The top-up patch mainly does the following things:
* Move the row filter column invalidation to CheckCmdReplicaIdentity, so that
the invalidation is executed only when an actual UPDATE or DELETE is executed
on the published relation. This is consistent with the existing check on
replica identity.
* Cache the results of the validation of row filter columns in the relcache to
reduce the cost of the validation. This is safe because every operation that
changes the row filter or replica identity will invalidate the relcache.
I have also attached the v42 patch set to keep the cfbot happy.
Best regards,
Hou zj
Attachments:
v42-0002-PS-Row-filter-validation-walker.patch
From 59f16f07590d79751620a1effe4b8e0f1469168f Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 25 Nov 2021 10:44:15 +1100
Subject: [PATCH v42] PS - Row filter validation walker
This patch implements a parse-tree "walker" to validate a row-filter expression.
REPLICA IDENTITY validation
---------------------------
For publish mode "delete" and "update" it validates that any columns referenced
in the filter expression must be part of REPLICA IDENTITY or Primary Key.
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
Expression Node-kind validation
-------------------------------
Only simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no user-defined functions.
- no system functions (unless they are IMMUTABLE). See design decision at [1].
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr, NullIfExpr
[1] https://www.postgresql.org/message-id/CAA4eK1%2BXoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ%40mail.gmail.com
This patch also disables (#if 0) all other row-filter errors which were thrown for
EXPR_KIND_PUBLICATION_WHERE in the 0001 patch.
~~
Test code and PG docs are also updated.
Author: Peter Smith
---
src/backend/catalog/pg_publication.c | 176 +++++++++++++++++++++++++++++-
src/backend/parser/parse_agg.c | 5 +-
src/backend/parser/parse_expr.c | 6 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 2 +
src/test/regress/expected/publication.out | 134 ++++++++++++++++++++---
src/test/regress/sql/publication.sql | 98 ++++++++++++++++-
src/test/subscription/t/026_row_filter.pl | 7 +-
src/tools/pgindent/typedefs.list | 1 +
9 files changed, 401 insertions(+), 31 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 3ffec3a..ae781cd 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,9 +33,11 @@
#include "catalog/pg_publication_namespace.h"
#include "catalog/pg_publication_rel.h"
#include "catalog/pg_type.h"
+#include "catalog/pg_proc.h"
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parse_clause.h"
#include "parser/parse_collate.h"
#include "parser/parse_relation.h"
@@ -219,10 +221,175 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
+/* For rowfilter_walker. */
+typedef struct {
+ Relation rel;
+ bool check_replident; /* check if Var is bms_replident member? */
+ Bitmapset *bms_replident;
+} rf_context;
+
+/*
+ * The row filter walker checks that the row filter expression is legal.
+ *
+ * Rules: Node-type validation
+ * ---------------------------
+ * Allow only simple or compound expressions like:
+ * - "(Var Op Const)" or
+ * - "(Var Op Const) Bool (Var Op Const)"
+ *
+ * Specifically,
+ * - User-defined operators are not allowed.
+ * - User-defined functions are not allowed.
+ * - System functions that are not IMMUTABLE are not allowed.
+ * - NULLIF is allowed.
+ *
+ * Rules: Replica Identity validation
+ * -----------------------------------
+ * If the flag context.check_replident is true then validate that every variable
+ * referenced by the filter expression is a valid member of the allowed set of
+ * replica identity columns (context.bms_replident).
+ */
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+ bool too_complex = false;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ /* Optionally, do replica identity validation of the referenced column. */
+ if (context->check_replident)
+ {
+ Oid relid = RelationGetRelid(context->rel);
+ Var *var = (Var *) node;
+ AttrNumber attnum = var->varattno;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, context->bms_replident))
+ {
+ const char *colname = get_attname(relid, attnum, false);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ RelationGetRelationName(context->rel)),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+ }
+ }
+ else if (IsA(node, Const) || IsA(node, BoolExpr) || IsA(node, NullIfExpr))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ Oid funcid = ((FuncExpr *)node)->funcid;
+ char *funcname = get_func_name(funcid);
+
+ /*
+ * User-defined functions are not allowed.
+ * System-functions that are not IMMUTABLE are not allowed.
+ */
+ if (funcid >= FirstNormalObjectId)
+ {
+ forbidden = psprintf("user-defined functions are not allowed: %s",
+ funcname);
+ }
+ else
+ {
+ if (func_volatile(funcid) != PROVOLATILE_IMMUTABLE)
+ forbidden = psprintf("system functions that are not IMMUTABLE are not allowed: %s",
+ funcname);
+ }
+ }
+ else
+ {
+ elog(DEBUG1, "the row filter contained something unexpected: %s", nodeToString(node));
+ too_complex = true;
+ }
+
+ if (too_complex)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(context->rel)),
+ errhint("only simple expressions using columns, constants and immutable system functions are allowed")
+ ));
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(context->rel)),
+ errdetail("%s", forbidden)
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
+ * Check if the row-filter is valid according to the following rules:
+ *
+ * 1. Only certain simple node types are permitted in the expression. See
+ * function rowfilter_walker for details.
+ *
+ * 2. If the publish operation contains "delete" or "update" then only columns
+ * that are allowed by the REPLICA IDENTITY rules are permitted to be used in
+ * the row-filter WHERE clause.
*/
+static void
+rowfilter_expr_checker(Publication *pub, Relation rel, Node *rfnode)
+{
+ rf_context context = {0};
+
+ context.rel = rel;
+
+ /*
+ * For "delete" or "update", check that filter cols are also valid replica
+ * identity cols.
+ */
+ if (pub->pubactions.pubdelete || pub->pubactions.pubupdate)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ context.check_replident = true;
+
+ /*
+ * Find what are the cols that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+ }
+ }
+
+ /*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if anything not permitted or unexpected is encountered.
+ */
+ rowfilter_walker(rfnode, &context);
+
+ bms_free(context.bms_replident);
+}
+
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -315,10 +482,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
whereclause = transformWhereClause(pstate,
copyObject(pri->whereClause),
EXPR_KIND_PUBLICATION_WHERE,
- "PUBLICATION");
+ "PUBLICATION WHERE");
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, targetrel, whereclause);
}
/* Form a tuple. */
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..5e0c391 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,12 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
if (isAgg)
err = _("aggregate functions are not allowed in publication WHERE expressions");
else
err = _("grouping operations are not allowed in publication WHERE expressions");
-
+#endif
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +952,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("window functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..3519e62 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -201,6 +201,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
case T_FuncCall:
{
+#if 0
/*
* Forbid functions in publication WHERE condition
*/
@@ -209,6 +210,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("functions are not allowed in publication WHERE expressions"),
parser_errposition(pstate, exprLocation(expr))));
+#endif
result = transformFuncCall(pstate, (FuncCall *) expr);
break;
@@ -1777,7 +1779,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("cannot use subquery in publication WHERE expression");
+#endif
break;
/*
@@ -3100,7 +3104,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 29bebb7..212f473 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,10 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+ pstate->p_hasTargetSRFs = true;
+#if 0
err = _("set-returning functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..b3588df 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,12 +718,14 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+#if 0
/* Check it's not a custom operator for publication WHERE expressions */
if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("user-defined operators are not allowed in publication WHERE expressions"),
parser_errposition(pstate, location)));
+#endif
/* Do typecasting and build the expression tree */
if (ltree == NULL)
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 6959675..fdf7659 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -248,13 +248,15 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -264,7 +266,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -275,7 +277,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -286,7 +288,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
@@ -310,26 +312,26 @@ Publications:
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax1
Publication testpub_syntax1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE ((e < 999))
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax2
Publication testpub_syntax2
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
@@ -353,19 +355,31 @@ ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: user-defined functions are not allowed: testpub_rf_func99
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: system functions that are not IMMUTABLE are not allowed: random
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+HINT: only simple expressions using columns, constants and immutable system functions are allowed
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
@@ -387,6 +401,92 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 40198fc..c7160bd 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -143,7 +143,9 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -163,12 +165,12 @@ RESET client_min_messages;
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax1
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax2
DROP PUBLICATION testpub_syntax2;
@@ -182,13 +184,23 @@ SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
SET client_min_messages = 'ERROR';
@@ -208,6 +220,82 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
+
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/026_row_filter.pl b/src/test/subscription/t/026_row_filter.pl
index 64e71d0..de6b73d 100644
--- a/src/test/subscription/t/026_row_filter.pl
+++ b/src/test/subscription/t/026_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -280,9 +282,7 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -291,7 +291,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index da6ac8e..0d5682b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3497,6 +3497,7 @@ replace_rte_variables_context
ret_type
rewind_source
rewrite_event
+rf_context
rijndael_ctx
rm_detail_t
role_auth_extra
--
1.8.3.1
v42-0003-Support-updates-based-on-old-and-new-tuple-in-ro.patchapplication/octet-stream; name=v42-0003-Support-updates-based-on-old-and-new-tuple-in-ro.patchDownload
From d77fa3d2a25c3e5f14a4250af9d35e8fa3ec793d Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 25 Nov 2021 12:05:52 +1100
Subject: [PATCH v42] Support updates based on old and new tuple in row
filters.
When applying row filter on updates, check both old_tuple and new_tuple
to decide how an update needs to be transformed.
UPDATE
old-row (match) new-row (no match) -> DELETE
old-row (no match) new row (match) -> INSERT
old-row (match) new row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Tuples that have been deformed are also cached in slots, to avoid
deforming the same tuple multiple times.
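The transformation table above can be sketched as follows. This is a minimal Python illustration of the decision logic only, not the patch's C implementation (which sets a ReorderBufferChangeType):

```python
# Illustrative sketch: map the row-filter results for an UPDATE's old
# and new tuples to the action that is actually replicated.
def transform_update(old_matches: bool, new_matches: bool):
    """Return the replicated action, or None to drop the change."""
    if old_matches and new_matches:
        return "UPDATE"   # row stays inside the filtered set
    if old_matches:
        return "DELETE"   # row leaves the filtered set
    if new_matches:
        return "INSERT"   # row enters the filtered set
    return None           # row was never visible to the subscriber
```

For example, an UPDATE that moves a row out of the filtered set must be sent as a DELETE, otherwise the subscriber would retain a stale copy.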
Author: Ajin Cherian
---
src/backend/replication/logical/proto.c | 35 ++--
src/backend/replication/pgoutput/pgoutput.c | 244 ++++++++++++++++++++++++++--
src/include/replication/logicalproto.h | 7 +-
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/026_row_filter.pl | 4 +-
src/tools/pgindent/typedefs.list | 1 +
6 files changed, 262 insertions(+), 35 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..6b55a94 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -31,8 +32,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
- HeapTuple tuple, bool binary);
-
+ HeapTuple tuple, TupleTableSlot *slot,
+ bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -410,7 +411,7 @@ logicalrep_write_insert(StringInfo out, TransactionId xid, Relation rel,
pq_sendint32(out, RelationGetRelid(rel));
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, NULL, binary);
}
/*
@@ -442,7 +443,8 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
*/
void
logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
- HeapTuple oldtuple, HeapTuple newtuple, bool binary)
+ HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary)
{
pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
@@ -463,11 +465,11 @@ logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
pq_sendbyte(out, 'O'); /* old tuple follows */
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, oldslot, binary);
}
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, newslot, binary);
}
/*
@@ -536,7 +538,7 @@ logicalrep_write_delete(StringInfo out, TransactionId xid, Relation rel,
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, NULL, binary);
}
/*
@@ -749,11 +751,12 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
-logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binary)
+logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, TupleTableSlot *slot,
+bool binary)
{
TupleDesc desc;
- Datum values[MaxTupleAttributeNumber];
- bool isnull[MaxTupleAttributeNumber];
+ Datum *values;
+ bool *isnull;
int i;
uint16 nliveatts = 0;
@@ -771,7 +774,17 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binar
enlargeStringInfo(out, tuple->t_len +
nliveatts * (1 + 4));
- heap_deform_tuple(tuple, desc, values, isnull);
+ if (slot == NULL || TTS_EMPTY(slot))
+ {
+ values = (Datum *) palloc(desc->natts * sizeof(Datum));
+ isnull = (bool *) palloc(desc->natts * sizeof(bool));
+ heap_deform_tuple(tuple, desc, values, isnull);
+ }
+ else
+ {
+ values = slot->tts_values;
+ isnull = slot->tts_isnull;
+ }
/* Write the values */
for (i = 0; i < desc->natts; i++)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index a2a0ce9..2c14e31 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -134,7 +134,10 @@ typedef struct RelationSyncEntry
*/
bool rowfilter_valid;
ExprState *exprstate; /* ExprState for row filter(s) */
- TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
+ TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
+ TupleTableSlot *tmp_new_tuple; /* slot for temporary new tuple used for expression evaluation */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -169,10 +172,16 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot,
+ RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry,
+ ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -736,18 +745,112 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * An update is checked against the row filter, if any.
+ * Updates are transformed to inserts or deletes based on the
+ * old_tuple and new_tuple. The new action is returned in the
+ * action parameter; if unchanged, the action remains an update.
+ * old-row (no match) new-row (no match) -> (drop change)
+ * old-row (no match) new row (match) -> INSERT
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (match) new row (match) -> UPDATE
+ * If the change is to be replicated returns true, else false.
*/
static bool
-pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
+{
+ TupleDesc desc = entry->scantuple->tts_tupleDescriptor;
+ int i;
+ bool old_matched, new_matched;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ /* update requires a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /*
+ * If there is no old_tuple, then none of the replica identity columns
+ * changed, so this reduces to a simple update.
+ */
+ if (!oldtuple)
+ {
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ return pgoutput_row_filter(relation, NULL, newtuple, entry);
+ }
+
+ old_slot = entry->old_tuple;
+ new_slot = entry->new_tuple;
+ tmp_new_slot = entry->tmp_new_tuple;
+ ExecClearTuple(old_slot);
+ ExecClearTuple(new_slot);
+ ExecClearTuple(tmp_new_slot);
+
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, new_slot);
+
+ /*
+ * For updates, both the newtuple and oldtuple need to be checked
+ * against the row filter. The newtuple might not have all the
+ * replica identity columns, in which case those values need to be
+ * copied over from the oldtuple.
+ */
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ /* if the column in the new_tuple is null, nothing to do */
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ /*
+ * Unchanged toasted replica identity columns are
+ * only detoasted in the old tuple, copy this over to the newtuple.
+ */
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ Assert(!old_slot->tts_isnull[i] &&
+ !VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]));
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ }
+
+ }
+
+ old_matched = pgoutput_row_filter_virtual(relation, old_slot, entry);
+ new_matched = pgoutput_row_filter_virtual(relation, tmp_new_slot, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (!old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && old_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter, the first time.
+ */
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
{
- EState *estate;
- ExprContext *ecxt;
ListCell *lc;
- bool result = true;
- Oid relid = RelationGetRelid(relation);
List *rfnodes = NIL;
int n_filters;
@@ -777,7 +880,7 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
TupleDesc tupdesc = RelationGetDescr(relation);
/*
- * Create a tuple table slot for row filter. TupleDesc must live as
+ * Create tuple table slots for row filter. TupleDesc must live as
* long as the cache remains. Release the tuple table slot if it
* already exists.
*/
@@ -786,9 +889,28 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
+ if (entry->old_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->old_tuple);
+ entry->old_tuple = NULL;
+ }
+ if (entry->new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->new_tuple);
+ entry->new_tuple = NULL;
+ }
+ if (entry->tmp_new_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->tmp_new_tuple);
+ entry->tmp_new_tuple = NULL;
+ }
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+
MemoryContextSwitchTo(oldctx);
/*
@@ -869,6 +991,67 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
entry->rowfilter_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * The row is passed in as a virtual slot.
+ *
+ */
+static bool
+pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ if (message_level_is_interesting(DEBUG3))
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(entry->relid)),
+ get_rel_name(entry->relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = slot;
+
+ /*
+ * NOTE: Multiple publication row-filters have already been combined to a
+ * single exprstate.
+ */
+ if (entry->exprstate)
+ {
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(entry->exprstate, ecxt);
+ }
+
+ /* Cleanup allocated resources */
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
/* Bail out if there is no row filter */
if (!entry->exprstate)
@@ -900,7 +1083,6 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
}
/* Cleanup allocated resources */
- ResetExprContext(ecxt);
FreeExecutorState(estate);
PopActiveSnapshot();
@@ -958,6 +1140,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -966,7 +1151,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(relation, NULL, tuple, relentry))
break;
/*
@@ -997,9 +1182,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
- /* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter_update_check(relation, oldtuple, newtuple, relentry,
+ &modified_action))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1022,8 +1208,27 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ logicalrep_write_update(ctx->out, xid, relation,
+ oldtuple, newtuple, relentry->old_tuple,
+ relentry->new_tuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1033,7 +1238,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1451,6 +1656,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
+ entry->new_tuple = NULL;
+ entry->old_tuple = NULL;
+ entry->tmp_new_tuple = NULL;
entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..427c40a 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -211,7 +212,11 @@ extern void logicalrep_write_insert(StringInfo out, TransactionId xid,
extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup);
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
- HeapTuple newtuple, bool binary);
+ HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/026_row_filter.pl b/src/test/subscription/t/026_row_filter.pl
index de6b73d..a2f25f6 100644
--- a/src/test/subscription/t/026_row_filter.pl
+++ b/src/test/subscription/t/026_row_filter.pl
@@ -277,7 +277,8 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+#   but the row is deleted by the UPDATE below.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -289,7 +290,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 0d5682b..4496fe4 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2194,6 +2194,7 @@ ReorderBufferApplyChangeCB
ReorderBufferApplyTruncateCB
ReorderBufferBeginCB
ReorderBufferChange
+ReorderBufferChangeType
ReorderBufferCommitCB
ReorderBufferCommitPreparedCB
ReorderBufferDiskChange
--
1.8.3.1
v42-0004-Tab-auto-complete-and-pgdump-support-for-Row-Fil.patchapplication/octet-stream; name=v42-0004-Tab-auto-complete-and-pgdump-support-for-Row-Fil.patchDownload
From ff5ae14d2d9b8358ab27d2ea5430b9b5329cab50 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 25 Nov 2021 12:40:30 +1100
Subject: [PATCH v42] Tab auto-complete and pgdump support for Row Filter.
tab-auto-complete
-----------------
e.g.
"CREATE PUBLICATION <name> FOR TABLE <name>" - complete with "WHERE (".
"ALTER PUBLICATION <name> ADD|SET TABLE <name>" - complete with "WHERE (".
Author: Peter Smith
pg_dump
-------
Author: Euler Taveira
---
src/bin/pg_dump/pg_dump.c | 24 ++++++++++++++++++++----
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/tab-complete.c | 13 ++++++++++++-
3 files changed, 33 insertions(+), 5 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 5a2094d..3696ad2 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4264,6 +4264,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4274,9 +4275,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4285,6 +4293,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4325,6 +4334,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationObject(&(pubrinfo[j].dobj), fout);
@@ -4395,8 +4408,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index d1d8608..0842a3c 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -630,6 +630,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index fa2e195..132b61f 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,14 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (");
+ /* "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with table attributes */
+ /* "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2768,10 +2776,13 @@ psql_completion(const char *text, int start, int end)
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
+ /* "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
/*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
--
1.8.3.1
v42-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v42-0001-Row-filter-for-logical-replication.patchDownload
From 882de4c6f8dafe5ca43ecce9d2b74d2d5f70d826 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 24 Nov 2021 16:17:59 +1100
Subject: [PATCH v42] Row filter for logical replication.
This feature adds row filtering for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause must contain only columns that are covered by REPLICA
IDENTITY, or are part of the primary key (when REPLICA IDENTITY is not set),
otherwise DELETE or UPDATE operations will not be replicated. That's because
the old row is used, and it contains only the primary key or the columns that
are part of the REPLICA IDENTITY; the remaining columns are NULL. For INSERT
operations, any column may be used in the WHERE clause. If the row filter
evaluates to NULL, it is treated as false. For simplicity, functions are not
allowed; this could be addressed in a future patch.
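The NULL semantics can be sketched with SQL's three-valued logic. This is an illustrative Python model, not the server's expression evaluator (None stands in for SQL NULL):

```python
# Sketch of SQL three-valued AND, and of how the row filter's result
# is interpreted: only TRUE replicates; NULL and FALSE both do not.
def sql_and(a, b):
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def filter_matches(result):
    return result is True  # NULL and FALSE both mean "do not replicate"
```

So for a DELETE, where the old tuple's non-identity columns are NULL, a filter such as `a > 1000 AND b <> 'filtered'` evaluates to NULL and the row is not replicated.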
If you choose to do the initial table synchronization, only data that satisfies
the row filters is pulled by the subscriber. If the subscription has several
publications in which a table has been published with different WHERE clauses,
rows that satisfy ANY of the expressions will be copied. If the subscriber is a
pre-15 version, data synchronization won't use row filters even if they are
defined on the publisher, because previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
Psql commands \dRp+ and \d+ will display any row filters.
Author: Euler Taveira, Peter Smith
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
Combining multiple row-filters
===============================
The subscription is treated "as a union of all the publications" [1], so the
row-filters are combined with OR.
If the subscription has several publications in which the same table has been
published with different filters, those expressions get OR'ed together so that
rows satisfying any of the expressions will be replicated.
Notice this means if one of the publications has no filter at all then all other
filters become redundant.
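The OR-combination described above can be sketched as follows (an illustrative Python model; the patch actually flattens the expressions into a single ExprState):

```python
# Hypothetical sketch: combine per-publication row filters with OR.
# Each entry is a predicate over a row; None means "no filter".
def combine_filters(filters):
    if any(f is None for f in filters):
        # A publication with no filter replicates every row,
        # which makes all other filters redundant.
        return lambda row: True
    return lambda row: any(f(row) for f in filters)
```

For instance, combining `a > 10` and `a < 0` replicates rows satisfying either predicate, while adding an unfiltered publication replicates everything.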
Author: Peter Smith
[1] https://www.postgresql.org/message-id/574b4e78-2f35-acf3-4bdc-4b872582e739%40enterprisedb.com
Row-filter caching
==================
The cached row-filters (e.g. ExprState *) are invalidated only in function
rel_sync_cache_relation_cb, so it means the ALTER PUBLICATION for one table
will not cause row-filters of other tables to also become invalidated.
The code related to caching row-filters is done just before they are needed
(in the pgoutput_row_filter function).
If there are multiple publication filters for a given table these are all
combined/flattened into a single filter.
Author: Peter Smith, Greg Nancarrow
The filter caching is based on suggestions from Amit [1] [2], and Houz [3]
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
[3] https://www.postgresql.org/message-id/OS0PR01MB5716090A70A73ADF58C58950948D9%40OS0PR01MB5716.jpnprd01.prod.outlook.com
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 13 +-
doc/src/sgml/ref/create_publication.sgml | 34 ++-
doc/src/sgml/ref/create_subscription.sgml | 17 ++
src/backend/catalog/pg_publication.c | 48 +++-
src/backend/commands/publicationcmds.c | 80 +++++--
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 38 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 116 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 348 ++++++++++++++++++++++++++-
src/bin/psql/describe.c | 27 ++-
src/include/catalog/pg_publication.h | 3 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 148 ++++++++++++
src/test/regress/sql/publication.sql | 75 ++++++
src/test/subscription/t/026_row_filter.pl | 357 ++++++++++++++++++++++++++++
23 files changed, 1312 insertions(+), 51 deletions(-)
mode change 100644 => 100755 src/backend/parser/gram.y
create mode 100644 src/test/subscription/t/026_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1d11be..af6b1f6 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6299,6 +6299,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index bb4ef5e..01247d7 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -30,7 +30,7 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -52,7 +52,9 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
remove one or more tables/schemas from the publication. Note that adding
tables/schemas to a publication that is already subscribed to will require a
<literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on the
- subscribing side in order to become effective.
+ subscribing side in order to become effective. Note also that a
+ row-filter <literal>WHERE</literal> clause cannot be combined with the
+ <literal>DROP</literal> clause.
</para>
<para>
@@ -109,7 +111,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 4aeb0c8..851f48c 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -28,7 +28,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -76,6 +76,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -226,6 +230,22 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause must contain only columns that are
+ covered by <literal>REPLICA IDENTITY</literal>, or are part of the primary
+ key (when <literal>REPLICA IDENTITY</literal> is not set), otherwise
+ <command>DELETE</command> or <command>UPDATE</command> operations will not
+ be replicated. That's because the old row is used, and it only contains the
+ primary key or the columns in the <literal>REPLICA IDENTITY</literal>; the
+ remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ operations, any column may be used in the <literal>WHERE</literal> clause.
+ The new row is used, and it contains all columns. A <literal>NULL</literal> value
+ causes the expression to evaluate to false; avoid using columns without
+ not-null constraints in the <literal>WHERE</literal> clause. The
+ <literal>WHERE</literal> clause does not allow functions or user-defined
+ operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -240,6 +260,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -253,6 +278,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..42bf8c2 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -208,6 +208,10 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
that are being subscribed to when the replication starts.
The default is <literal>true</literal>.
</para>
+ <para>
+ Row-filtering may also apply here and will affect what data is
+ copied. Refer to the Notes section below.
+ </para>
</listitem>
</varlistentry>
@@ -319,6 +323,19 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
the parameter <literal>create_slot = false</literal>. This is an
implementation restriction that might be lifted in a future release.
</para>
+
+ <para>
+ If any table in the publication has a <literal>WHERE</literal> clause, rows
+ that do not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be replicated. If the subscription has several publications in
+ which the same table has been published with different filters, those
+ expressions get OR'ed together so that rows satisfying any of the expressions
+ will be replicated. Note that this means that if one of the publications has
+ no filter at all, then all other filters become redundant. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row filtering
+ is ignored.
+ </para>
+
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 63579b2..3ffec3a 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -257,18 +260,22 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -289,10 +296,30 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a ParseState to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -306,6 +333,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -322,6 +355,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 7d4a0e9..c9ad079 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -529,40 +529,61 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *delrels = NIL;
ListCell *oldlc;
+ /*
+ * Check if the relation is a member of an existing schema in the
+ * publication or a member of the specified schema list.
+ */
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
-
- /* Calculate which relations to drop. */
+ /*
+ * Remove tables that are not found in the new table list and those
+ * tables which have a qual expression. The qual expression could be
+ * in the old table list or in the new table list.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
+ ListCell *newlc;
+ PublicationRelInfo *oldrel;
bool found = false;
+ HeapTuple rftuple;
+ bool rfisnull = true;
+
+ /* look up the cache for the old relmap */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(oldrelid),
+ ObjectIdGetDatum(pubid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual,
+ &rfisnull);
+ ReleaseSysCache(rftuple);
+ }
foreach(newlc, rels)
{
PublicationRelInfo *newpubrel;
newpubrel = (PublicationRelInfo *) lfirst(newlc);
- if (RelationGetRelid(newpubrel->relation) == oldrelid)
+
+ /*
+ * If the new relation or the old relation has a where clause,
+ * we need to remove it so that it can be added afresh later.
+ */
+ if (RelationGetRelid(newpubrel->relation) == oldrelid &&
+ newpubrel->whereClause == NULL && rfisnull)
{
found = true;
break;
}
}
- /* Not yet in the list, open it and add to the list */
+
if (!found)
{
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrelid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
}
@@ -899,6 +920,7 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ List *relids_with_rf = NIL;
/*
* Open, share-lock, and check all the explicitly-specified relations
@@ -926,15 +948,30 @@ OpenTableList(List *tables)
*/
if (list_member_oid(relids, myrelid))
{
+ char *relname = pstrdup(RelationGetRelationName(rel));
+
table_close(rel, ShareUpdateExclusiveLock);
+
+ /* Disallow duplicate tables if there are any with row-filters. */
+ if (t->whereClause || list_member_oid(relids_with_rf, myrelid))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("conflicting or redundant row-filters for \"%s\"",
+ relname)));
+
+ pfree(relname);
continue;
}
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
+ if (t->whereClause)
+ relids_with_rf = lappend_oid(relids_with_rf, myrelid);
+
/*
* Add children of this rel, if requested, so that they too are added
* to the publication. A partitioned table can't have any inheritance
@@ -967,6 +1004,8 @@ OpenTableList(List *tables)
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -974,6 +1013,7 @@ OpenTableList(List *tables)
}
list_free(relids);
+ list_free(relids_with_rf);
return rels;
}
@@ -993,6 +1033,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -1039,10 +1081,11 @@ PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
{
PublicationRelInfo *pub_rel = (PublicationRelInfo *) lfirst(lc);
Relation rel = pub_rel->relation;
+ Oid relid = RelationGetRelid(rel);
ObjectAddress obj;
/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(relid, GetUserId()))
aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
RelationGetRelationName(rel));
@@ -1088,6 +1131,11 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
RelationGetRelationName(rel))));
}
+ if (pubrel->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE")));
+
ObjectAddressSet(obj, PublicationRelRelationId, prid);
performDeletion(&obj, DROP_CASCADE, 0);
}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 7d55fd6..4bcd77c 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4831,6 +4831,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index f537d3e..5776447 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2312,6 +2312,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
old mode 100644
new mode 100755
index a6d0cef..4d0616b
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9652,12 +9652,13 @@ CreatePublicationStmt:
* relation_expr here.
*/
PublicationObjSpec:
- TABLE relation_expr
+ TABLE relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_TABLE;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $2;
+ $$->pubtable->whereClause = $3;
}
| ALL TABLES IN_P SCHEMA ColId
{
@@ -9672,28 +9673,45 @@ PublicationObjSpec:
$$->pubobjtype = PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA;
$$->location = @5;
}
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ /*
+ * The OptWhereClause (row-filter) must be stored here
+ * but it is valid only for tables. If the ColId was
+ * mistakenly not a table this will be detected later
+ * in preprocess_pubobj_list() and an error is thrown.
+ */
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
$$->location = @1;
}
- | ColId indirection
+ | ColId indirection OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
+ $$->pubtable->whereClause = $3;
$$->location = @1;
}
/* grammar like tablename * , ONLY tablename, ONLY ( tablename ) */
- | extended_relation_expr
+ | extended_relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $1;
+ $$->pubtable->whereClause = $2;
}
| CURRENT_SCHEMA
{
@@ -17341,7 +17359,8 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ if (pubobj->name)
{
/* convert it to PublicationTable */
PublicationTable *pubtable = makeNode(PublicationTable);
@@ -17354,6 +17373,13 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
else if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_SCHEMA ||
pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA)
{
+ /* Row filters are not allowed on schema objects. */
+ if (pubobj->pubtable && pubobj->pubtable->whereClause)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid to use WHERE (row-filter) for a schema"),
+ parser_errposition(pubobj->location));
+
/*
* We can distinguish between the different type of schema
* objects based on whether name and pubtable is set.
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 542f916..29bebb7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..af73b14 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,80 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get relation qual. DISTINCT avoids the same expression of a table in
+ * multiple publications from being included multiple times in the final
+ * expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ /*
+ * Multiple row-filter expressions for the same table will later be
+ * combined by the COPY using OR, but this means if any of the filters is
+ * null, then effectively none of the other filters is meaningful. So this
+ * loop is also checking for null filters and can exit early if any are
+ * encountered.
+ */
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+
+ if (isnull)
+ {
+ /*
+ * A single null filter nullifies the effect of any other filter for this
+ * table.
+ */
+ if (*qual)
+ {
+ list_free(*qual);
+ *qual = NIL;
+ }
+ break;
+ }
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
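Since the null-filter short-circuit above is a bit subtle, here is an illustrative Python analogue (not the patch's C code) of how fetch_remote_table_info() flattens the per-publication filters:

```python
def combine_row_filters(filters):
    """Collect the non-NULL row filters fetched from the publisher so the
    caller can OR them together.  A single NULL filter (a publication
    that published the table without any WHERE clause) makes every other
    filter redundant, so the result collapses to "no filter at all"."""
    quals = []
    for f in filters:
        if f is None:
            return []  # one unfiltered publication nullifies the rest
        quals.append(f)
    return quals
```

The empty-list result corresponds to `*qual = NIL` in the C code, which makes copy_table() fall back to an unfiltered COPY.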
@@ -809,6 +887,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +896,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +907,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with OR'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +927,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of OR'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
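To make the query-building logic above concrete, here is a rough Python sketch of the COPY command copy_table() now issues for an ordinary table (identifier quoting and the non-table/generated-column cases are omitted):

```python
def build_copy_query(nspname, relname, columns, quals):
    """With no row filter, a plain COPY is used; otherwise a
    COPY (SELECT ... WHERE f1 OR f2 ...) carries the OR'ed filters
    fetched from the publisher's pg_publication_rel.prqual entries."""
    target = f"{nspname}.{relname}"
    if not quals:
        return f"COPY {target} TO STDOUT"
    return (f"COPY (SELECT {', '.join(columns)} FROM {target}"
            f" WHERE {' OR '.join(quals)}) TO STDOUT")
```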
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6f6a203..a2a0ce9 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -1,4 +1,4 @@
-/*-------------------------------------------------------------------------
+/*------------------------------------------------------------------------
*
* pgoutput.c
* Logical Replication output plugin
@@ -13,18 +13,28 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/makefuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -116,6 +126,17 @@ typedef struct RelationSyncEntry
PublicationActions pubactions;
/*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' indicates if the exprstate has been assigned
+ * yet or not. We cannot just use the exprstate value for this purpose
+ * because there might be no filter at all for the current relid (e.g.
+ * exprstate is NULL).
+ */
+ bool rowfilter_valid;
+ ExprState *exprstate; /* ExprState for row filter(s) */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+
+ /*
* OID of the relation to publish changes as. For a partition, this may
* be set to one of its ancestors whose schema will be used when
* replicating changes, if publish_via_partition_root is set for the
@@ -137,7 +158,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +167,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +649,265 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+
+ /* Cache ExprState using CacheMemoryContext. */
+ Assert(CurrentMemoryContext == CacheMemoryContext);
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be cast to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * This is the same code as ExecPrepareExpr(), which we cannot use here
+ * because we have no EState to pass it. There should probably be
+ * another executor function to handle execution outside a normal Plan
+ * tree context.
+ */
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false, i.e. the change
+ * isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change should be replicated, false otherwise.
+ */
+static bool
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ List *rfnodes = NIL;
+ int n_filters;
+
+ /*
+ * If the row filter cache is currently flagged "invalid", we don't yet
+ * know whether there are any row filters for this relation.
+ *
+ * This code usually executes only once per relation.
+ *
+ * NOTE: The ExprState cache could have been built up-front in
+ * get_rel_sync_entry() instead of deferred to here. We do it here
+ * because there are scenarios where get_rel_sync_entry() is called but
+ * no row is published: for truncate, for example, no row evaluation is
+ * needed, so there is no point computing it. Building it early would
+ * also be wasted work if an error occurs before the filter is ever
+ * evaluated, and future operations using get_rel_sync_entry() might not
+ * need an ExprState either. Since the decision to publish is made only
+ * AFTER the call to get_rel_sync_entry, we defer building the ExprState
+ * to the last moment, when we know it is actually needed.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ MemoryContext oldctx;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create a tuple table slot for the row filter. The TupleDesc must
+ * live as long as the cache entry does, so release any existing
+ * slot first.
+ */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
+ /*
+ * Check whether there are any row filters for this relation. If so,
+ * prepare the necessary ExprState and cache it in entry->exprstate.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ *
+ * NOTE: Multiple row filters for the same table are combined by OR-ing
+ * them together. This means that if any of the publications has *no*
+ * filter, the other filters have no effect either: everything is
+ * published.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+
+ /*
+ * Look up whether there is a row filter and, if so, remember it in
+ * a list. The collected filters are combined after this
+ * 'publications' loop.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(entry->publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes = lappend(rfnodes, rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ ReleaseSysCache(rftuple);
+
+ if (rfisnull)
+ {
+ /*
+ * If there is no row filter, then any other row filters for this
+ * table have no effect either (because filters are OR-ed
+ * together), so discard anything found so far and exit the
+ * publications loop early.
+ */
+ if (rfnodes)
+ {
+ list_free_deep(rfnodes);
+ rfnodes = NIL;
+ }
+ break;
+ }
+ }
+
+ } /* loop all subscribed publications */
+
+ /*
+ * Combine all the row filters (if any) into a single filter, then
+ * build the ExprState for it.
+ */
+ n_filters = list_length(rfnodes);
+ if (n_filters > 0)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes, -1) : linitial(rfnodes);
+ entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ entry->rowfilter_valid = true;
+ }
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ if (message_level_is_interesting(DEBUG3))
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * NOTE: Multiple publication row filters have already been combined into a
+ * single exprstate.
+ */
+ if (entry->exprstate)
+ {
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(entry->exprstate, ecxt);
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +934,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +958,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +965,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * The schema should be sent before the relation-switching logic
+ * below, because maybe_send_schema() also sends the ancestor's
+ * relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +998,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +1032,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +1101,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1116,9 +1423,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1139,8 +1447,11 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1245,9 +1556,6 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
}
list_free(pubids);
@@ -1354,6 +1662,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Clean up the row filter cache; it will be rebuilt later if needed.
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstate != NULL)
+ {
+ pfree(entry->exprstate);
+ entry->exprstate = NULL;
+ }
}
}
@@ -1365,6 +1688,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1698,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1392,6 +1718,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
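To summarize the pgoutput.c semantics above in one place: a NULL filter result is treated as false, per-publication filters are combined by OR-ing, and a publication with no filter means everything is published. Here is a small standalone Python sketch of those rules; the helper names (`combine_filters`, `row_passes`) are purely illustrative and not part of the patch or of PostgreSQL:

```python
# Illustrative model of the row-filter semantics in the patch above.
# Each publication contributes either a predicate (a callable) or None
# (meaning that publication has no filter for the table).

def combine_filters(filters):
    """Mirror the patch: if any publication has no filter, publish all
    rows; otherwise OR the individual filters together."""
    if any(f is None for f in filters):
        return None                      # no filtering at all
    return lambda row: any(f(row) for f in filters)

def row_passes(combined, row):
    """A NULL (here: None) evaluation result is treated as false,
    i.e. the change is not replicated."""
    if combined is None:
        return True                      # no filter: always replicate
    result = combined(row)
    return bool(result) if result is not None else False

# Filters resembling the regression tests above:
f1 = lambda row: 1000 < row["e"] < 2000   # (e > 1000 AND e < 2000)
f2 = lambda row: row["e"] < 500           # (e < 500)

print(row_passes(combine_filters([f1, f2]), {"e": 1500}))    # True: matches f1
print(row_passes(combine_filters([f1, f2]), {"e": 5000}))    # False: matches neither
print(row_passes(combine_filters([f1, None]), {"e": 5000}))  # True: a no-filter publication
```

This is why the publications loop above discards all collected filters and exits early as soon as one publication without a row filter is found.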
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index ea721d9..8be5643 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3150,17 +3150,22 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
" JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
"WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
"UNION\n"
"SELECT pubname\n"
+ " , pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+ " , pg_catalog.pg_class c\n"
"WHERE pr.prrelid = '%s'\n"
+ " AND c.oid = pr.prrelid\n"
"UNION\n"
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
"WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
"ORDER BY 1;",
@@ -3196,6 +3201,13 @@ describeOneTableDetails(const char *schemaname,
printfPQExpBuffer(&buf, " \"%s\"",
PQgetvalue(result, i, 0));
+ if (pset.sversion >= 150000)
+ {
+ /* Also display the publication row-filter (if any) for this table */
+ if (!PQgetisnull(result, i, 1))
+ appendPQExpBuffer(&buf, " WHERE (%s)", PQgetvalue(result, i, 1));
+ }
+
printTableAddFooter(&cont, buf.data);
}
PQclear(result);
@@ -6320,8 +6332,12 @@ addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
for (i = 0; i < count; i++)
{
if (!singlecol)
+ {
printfPQExpBuffer(buf, " \"%s.%s\"", PQgetvalue(res, i, 0),
PQgetvalue(res, i, 1));
+ if (!PQgetisnull(res, i, 2))
+ appendPQExpBuffer(buf, " WHERE (%s)", PQgetvalue(res, i, 2));
+ }
else
printfPQExpBuffer(buf, " \"%s\"", PQgetvalue(res, i, 0));
@@ -6450,8 +6466,15 @@ describePublications(const char *pattern)
{
/* Get the tables for the specified publication */
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 1ae439e..bd0d4ce 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -86,6 +86,7 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -122,7 +123,7 @@ extern List *GetPubPartitionOptionRelations(List *result,
Oid relid);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 067138e..5d58a9f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3641,6 +3641,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 1feb558..6959675 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -239,6 +239,154 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+ Table "public.testpub_rf_tbl1"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+----------+--------------+-------------
+ a | integer | | | | plain | |
+ b | text | | | | extended | |
+Publications:
+ "testpub5a" WHERE ((a > 1))
+ "testpub5b"
+ "testpub5c" WHERE ((a > 3))
+
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+ Publication testpub_syntax1
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE ((e < 999))
+
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+ Publication testpub_syntax2
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
+
+DROP PUBLICATION testpub_syntax2;
+-- fail - WHERE row filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a =...
+ ^
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+ERROR: invalid to use WHERE (row-filter) for a schema
+LINE 1: ...ax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf...
+ ^
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+ERROR: cannot add relation "testpub_rf_myschema1.testpub_rf_tb16" to publication
+DETAIL: Table's schema "testpub_rf_myschema1" is already part of the publication or part of the specified schema list.
+RESET client_min_messages;
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 8fa0435..40198fc 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -134,6 +134,81 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;
+-- fail - WHERE row filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+RESET client_min_messages;
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
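In the TAP test that follows, tab_rowfilter_2 is published through two publications, tap_pub_1 with (c % 2 = 0) and tap_pub_2 with (c % 3 = 0). Because filters from multiple publications are OR-ed together, a row should replicate when either predicate holds, assuming a single subscription to both publications. A quick Python check of which values in 1..20 would pass (the predicates come from the test; the rest is illustrative):

```python
# OR-combination of the two tab_rowfilter_2 filters from the TAP test:
# tap_pub_1 has (c % 2 = 0), tap_pub_2 has (c % 3 = 0). A row is
# replicated if either predicate holds.
replicated = [c for c in range(1, 21) if c % 2 == 0 or c % 3 == 0]
print(replicated)  # [2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20]
```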
diff --git a/src/test/subscription/t/026_row_filter.pl b/src/test/subscription/t/026_row_filter.pl
new file mode 100644
index 0000000..64e71d0
--- /dev/null
+++ b/src/test/subscription/t/026_row_filter.pl
@@ -0,0 +1,357 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 10;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4a FOR TABLE tab_rowfilter_4 WHERE (c % 2 = 0)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4b FOR TABLE tab_rowfilter_4"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) SELECT generate_series(1, 10)");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4a, tap_pub_4b"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if the
+# c value is divisible by 2 OR 3 (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(13|2|20), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but when one publication has no filter it
+# means OR everything, i.e. the same as having no filter at all.
+# Expect all rows: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(10|1|10), 'check initial data copy from table tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# The publication option publish_via_partition_root is false, so the row
+# filter of each partition is used.
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) VALUES (21), (22), (23), (24), (25)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) VALUES (0), (11), (12)");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if the
+# c value is divisible by 2 OR 3.
+#
+# Expect original rows (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+# Plus (21, 22, 24)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(16|2|24), 'check replicated rows to tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but when one publication has no filter it
+# means OR everything, i.e. the same as having no filter at all.
+# Expect all rows from initial copy: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+# And also (0, 11, 12)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# The publication option publish_via_partition_root is true, so the row
+# filter of the root partitioned table is used.
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
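Not part of either patch: a minimal Python sketch of the row-filter semantics
the TAP test above verifies. When a table is in several publications, the
filter expressions are OR'ed together, and a publication with no filter is
treated as matching every row. The function and row shape here are
illustrative only.

```python
def combine_filters(filters):
    """Combine per-publication row filters for one table.

    filters: list of predicates taking a row dict; None means the
    publication has no filter (i.e. it publishes every row).
    Returns a single predicate implementing the OR'ed semantics.
    """
    if any(f is None for f in filters):
        # One publication without a filter admits all rows.
        return lambda row: True
    return lambda row: any(f(row) for f in filters)

# tab_rowfilter_2 is in tap_pub_1 (c % 2 = 0) and tap_pub_2 (c % 3 = 0):
pred = combine_filters([lambda r: r["c"] % 2 == 0,
                        lambda r: r["c"] % 3 == 0])
replicated = [c for c in range(1, 21) if pred({"c": c})]

# tab_rowfilter_4 is in tap_pub_4a (c % 2 = 0) and tap_pub_4b (no filter):
all_rows = combine_filters([lambda r: r["c"] % 2 == 0, None])
```

For the initial 20 rows of tab_rowfilter_2 this yields the 13 rows
(2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20) that the test expects, and
tab_rowfilter_4 replicates everything.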
v42-0005-Topup-cache-the-result-of-row-filter-column-validation.patch (application/octet-stream) Download
From 336ea6007d4c07108d2161cea44ea31f6aa87b45 Mon Sep 17 00:00:00 2001
From: "houzj.fnst" <houzj.fnst@fujitsu.com>
Date: Fri, 26 Nov 2021 10:07:55 +0800
Subject: [PATCH] cache the result of row filter column validation
For publish modes "delete" and "update", validate that any columns referenced
in the filter expression are part of the REPLICA IDENTITY or primary key.
Move the row filter column validation to CheckCmdReplicaIdentity, so that
the validation is executed only when an actual UPDATE or DELETE is executed
on the published relation. This is consistent with the existing replica
identity check.
Cache the result of the row filter column validation in the relcache to
reduce the cost of the validation. This is safe because every operation that
changes the row filter or the replica identity invalidates the relcache.
The function GetRelationPublicationActions is temporarily retained because
it is a public function.
---
src/backend/catalog/pg_publication.c | 131 ++++++---------------
src/backend/executor/execReplication.c | 29 ++++-
src/backend/utils/cache/relcache.c | 134 +++++++++++++++++-----
src/include/catalog/pg_publication.h | 13 +++
src/include/utils/rel.h | 3 +-
src/include/utils/relcache.h | 1 +
src/test/regress/expected/publication.out | 72 +++++++-----
src/test/regress/sql/publication.sql | 39 +++++--
8 files changed, 254 insertions(+), 168 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index ae781cd7e0..4bcfdc548c 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -221,12 +221,29 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
-/* For rowfilter_walker. */
-typedef struct {
- Relation rel;
- bool check_replident; /* check if Var is bms_replident member? */
- Bitmapset *bms_replident;
-} rf_context;
+/*
+ * Check whether the row-filter WHERE clause references any column that is
+ * not part of the REPLICA IDENTITY. Returns true if such a column is found.
+ */
+bool
+check_rowfilter_replident(Node *node, Bitmapset *bms_replident)
+{
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+ AttrNumber attnum = var->varattno;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber,
+ bms_replident))
+ return true;
+ }
+
+ return expression_tree_walker(node, check_rowfilter_replident,
+ (void *) bms_replident);
+}
/*
* The row filter walker checks that the row filter expression is legal.
@@ -242,15 +259,9 @@ typedef struct {
* - User-defined functions are not allowed.
* - System functions that are not IMMUTABLE are not allowed.
* - NULLIF is allowed.
- *
- * Rules: Replica Identity validation
- * -----------------------------------
- * If the flag context.check_replident is true then validate that every variable
- * referenced by the filter expression is a valid member of the allowed set of
- * replica identity columns (context.bms_replindent)
*/
static bool
-rowfilter_walker(Node *node, rf_context *context)
+rowfilter_walker(Node *node, Relation relation)
{
char *forbidden = NULL;
bool too_complex = false;
@@ -258,29 +269,7 @@ rowfilter_walker(Node *node, rf_context *context)
if (node == NULL)
return false;
- if (IsA(node, Var))
- {
- /* Optionally, do replica identify validation of the referenced column. */
- if (context->check_replident)
- {
- Oid relid = RelationGetRelid(context->rel);
- Var *var = (Var *) node;
- AttrNumber attnum = var->varattno;
-
- if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, context->bms_replident))
- {
- const char *colname = get_attname(relid, attnum, false);
-
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
- errmsg("cannot add relation \"%s\" to publication",
- RelationGetRelationName(context->rel)),
- errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
- colname)));
- }
- }
- }
- else if (IsA(node, Const) || IsA(node, BoolExpr) || IsA(node, NullIfExpr))
+ else if (IsA(node, Var) || IsA(node, Const) || IsA(node, BoolExpr) || IsA(node, NullIfExpr))
{
/* OK */
}
@@ -320,74 +309,18 @@ rowfilter_walker(Node *node, rf_context *context)
if (too_complex)
ereport(ERROR,
(errmsg("invalid publication WHERE expression for relation \"%s\"",
- RelationGetRelationName(context->rel)),
+ RelationGetRelationName(relation)),
errhint("only simple expressions using columns, constants and immutable system functions are allowed")
));
if (forbidden)
ereport(ERROR,
(errmsg("invalid publication WHERE expression for relation \"%s\"",
- RelationGetRelationName(context->rel)),
+ RelationGetRelationName(relation)),
errdetail("%s", forbidden)
));
- return expression_tree_walker(node, rowfilter_walker, (void *)context);
-}
-
-/*
- * Check if the row-filter is valid according to the following rules:
- *
- * 1. Only certain simple node types are permitted in the expression. See
- * function rowfilter_walker for details.
- *
- * 2. If the publish operation contains "delete" or "update" then only columns
- * that are allowed by the REPLICA IDENTITY rules are permitted to be used in
- * the row-filter WHERE clause.
- */
-static void
-rowfilter_expr_checker(Publication *pub, Relation rel, Node *rfnode)
-{
- rf_context context = {0};
-
- context.rel = rel;
-
- /*
- * For "delete" or "update", check that filter cols are also valid replica
- * identity cols.
- */
- if (pub->pubactions.pubdelete || pub->pubactions.pubupdate)
- {
- char replica_identity = rel->rd_rel->relreplident;
-
- if (replica_identity == REPLICA_IDENTITY_FULL)
- {
- /*
- * FULL means all cols are in the REPLICA IDENTITY, so all cols are
- * allowed in the row-filter too.
- */
- }
- else
- {
- context.check_replident = true;
-
- /*
- * Find what are the cols that are part of the REPLICA IDENTITY.
- * Note that REPLICA IDENTIY DEFAULT means primary key or nothing.
- */
- if (replica_identity == REPLICA_IDENTITY_DEFAULT)
- context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
- else
- context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
- }
- }
-
- /*
- * Walk the parse-tree of this publication row filter expression and throw an
- * error if anything not permitted or unexpected is encountered.
- */
- rowfilter_walker(rfnode, &context);
-
- bms_free(context.bms_replident);
+ return expression_tree_walker(node, rowfilter_walker, (void *) relation);
}
List *
@@ -487,8 +420,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
- /* Validate the row-filter. */
- rowfilter_expr_checker(pub, targetrel, whereclause);
+ /*
+ * Walk the parse-tree of this publication row filter expression and
+ * throw an error if anything not permitted or unexpected is
+ * encountered.
+ */
+ rowfilter_walker(whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 574d7d27fd..c917466e7e 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -567,12 +567,34 @@ ExecSimpleRelationDelete(ResultRelInfo *resultRelInfo,
void
CheckCmdReplicaIdentity(Relation rel, CmdType cmd)
{
- PublicationActions *pubactions;
+ PublicationInfo *pubinfo;
/* We only need to do checks for UPDATE and DELETE. */
if (cmd != CMD_UPDATE && cmd != CMD_DELETE)
return;
+ pubinfo = RelationGetPublicationInfo(rel);
+
+ /*
+ * If not all columns in the publication row filter are part of the REPLICA
+ * IDENTITY, it is unsafe to execute UPDATE or DELETE on the relation.
+ */
+ if (!pubinfo->rfcol_valid_for_replid)
+ {
+ if (cmd == CMD_UPDATE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot update table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Not all row filter columns are part of the REPLICA IDENTITY")));
+ else if (cmd == CMD_DELETE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot delete from table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Not all row filter columns are part of the REPLICA IDENTITY")));
+ }
+
/* If relation has replica identity we are always good. */
if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
OidIsValid(RelationGetReplicaIndex(rel)))
@@ -583,14 +605,13 @@ CheckCmdReplicaIdentity(Relation rel, CmdType cmd)
*
* Check if the table publishes UPDATES or DELETES.
*/
- pubactions = GetRelationPublicationActions(rel);
- if (cmd == CMD_UPDATE && pubactions->pubupdate)
+ if (cmd == CMD_UPDATE && pubinfo->pubactions.pubupdate)
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot update table \"%s\" because it does not have a replica identity and publishes updates",
RelationGetRelationName(rel)),
errhint("To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.")));
- else if (cmd == CMD_DELETE && pubactions->pubdelete)
+ else if (cmd == CMD_DELETE && pubinfo->pubactions.pubdelete)
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot delete from table \"%s\" because it does not have a replica identity and publishes deletes",
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 9fa9e671a1..542da49fe2 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -56,6 +56,7 @@
#include "catalog/pg_opclass.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "catalog/pg_rewrite.h"
#include "catalog/pg_shseclabel.h"
#include "catalog/pg_statistic_ext.h"
@@ -2432,8 +2433,8 @@ RelationDestroyRelation(Relation relation, bool remember_tupdesc)
bms_free(relation->rd_keyattr);
bms_free(relation->rd_pkattr);
bms_free(relation->rd_idattr);
- if (relation->rd_pubactions)
- pfree(relation->rd_pubactions);
+ if (relation->rd_pubinfo)
+ pfree(relation->rd_pubinfo);
if (relation->rd_options)
pfree(relation->rd_options);
if (relation->rd_indextuple)
@@ -5547,22 +5548,45 @@ RelationGetExclusionInfo(Relation indexRelation,
struct PublicationActions *
GetRelationPublicationActions(Relation relation)
{
- List *puboids;
- ListCell *lc;
- MemoryContext oldcxt;
- Oid schemaid;
- PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
+ PublicationInfo *pubinfo;
+ PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
+
+ pubinfo = RelationGetPublicationInfo(relation);
+
+ pubactions = memcpy(pubactions, pubinfo,
+ sizeof(PublicationActions));
+
+ pfree(pubinfo);
+
+ return pubactions;
+}
+
+
+
+/*
+ * Get publication information for the given relation.
+ */
+struct PublicationInfo *
+RelationGetPublicationInfo(Relation relation)
+{
+ List *puboids;
+ ListCell *lc;
+ MemoryContext oldcxt;
+ Oid schemaid;
+ Bitmapset *bms_replident = NULL;
+ PublicationInfo *pubinfo = palloc0(sizeof(PublicationInfo));
+
+ pubinfo->rfcol_valid_for_replid = true;
/*
* If not publishable, it publishes no actions. (pgoutput_change() will
* ignore it.)
*/
if (!is_publishable_relation(relation))
- return pubactions;
+ return pubinfo;
- if (relation->rd_pubactions)
- return memcpy(pubactions, relation->rd_pubactions,
- sizeof(PublicationActions));
+ if (relation->rd_pubinfo)
+ return memcpy(pubinfo, relation->rd_pubinfo, sizeof(PublicationInfo));
/* Fetch the publication membership info. */
puboids = GetRelationPublications(RelationGetRelid(relation));
@@ -5586,12 +5610,25 @@ GetRelationPublicationActions(Relation relation)
GetSchemaPublications(schemaid));
}
}
+
puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
+ /*
+ * Find the columns that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (relation->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT)
+ bms_replident = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else if (relation->rd_rel->relreplident == REPLICA_IDENTITY_INDEX)
+ bms_replident = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_IDENTITY_KEY);
+
foreach(lc, puboids)
{
Oid pubid = lfirst_oid(lc);
HeapTuple tup;
+
Form_pg_publication pubform;
tup = SearchSysCache1(PUBLICATIONOID, ObjectIdGetDatum(pubid));
@@ -5601,35 +5638,80 @@ GetRelationPublicationActions(Relation relation)
pubform = (Form_pg_publication) GETSTRUCT(tup);
- pubactions->pubinsert |= pubform->pubinsert;
- pubactions->pubupdate |= pubform->pubupdate;
- pubactions->pubdelete |= pubform->pubdelete;
- pubactions->pubtruncate |= pubform->pubtruncate;
+ pubinfo->pubactions.pubinsert |= pubform->pubinsert;
+ pubinfo->pubactions.pubupdate |= pubform->pubupdate;
+ pubinfo->pubactions.pubdelete |= pubform->pubdelete;
+ pubinfo->pubactions.pubtruncate |= pubform->pubtruncate;
ReleaseSysCache(tup);
/*
- * If we know everything is replicated, there is no point to check for
- * other publications.
+ * If the publication actions include UPDATE or DELETE, validate that
+ * any columns referenced in the filter expression are part of the
+ * REPLICA IDENTITY index.
+ *
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter and we can skip the validation.
+ */
+ if ((pubform->pubupdate || pubform->pubdelete) &&
+ relation->rd_rel->relreplident != REPLICA_IDENTITY_FULL)
+ {
+ HeapTuple rftuple;
+
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP,
+ ObjectIdGetDatum(RelationGetRelid(relation)),
+ ObjectIdGetDatum(pubid));
+
+ if (HeapTupleIsValid(rftuple))
+ {
+ Datum rfdatum;
+ bool rfisnull;
+ Node *rfnode;
+
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
+ Anum_pg_publication_rel_prqual,
+ &rfisnull);
+
+ if (!rfisnull)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ if (check_rowfilter_replident(rfnode, bms_replident))
+ {
+ pubinfo->rfcol_valid_for_replid = false;
+ ReleaseSysCache(rftuple);
+ break;
+ }
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+ }
+
+ /*
+ * If we know everything is replicated and some row filter column is not
+ * part of the replica identity, there is no point in checking the other
+ * publications.
*/
- if (pubactions->pubinsert && pubactions->pubupdate &&
- pubactions->pubdelete && pubactions->pubtruncate)
+ if (pubinfo->pubactions.pubinsert && pubinfo->pubactions.pubupdate &&
+ pubinfo->pubactions.pubdelete && pubinfo->pubactions.pubtruncate &&
+ !pubinfo->rfcol_valid_for_replid)
break;
}
- if (relation->rd_pubactions)
+ bms_free(bms_replident);
+
+ if (relation->rd_pubinfo)
{
- pfree(relation->rd_pubactions);
- relation->rd_pubactions = NULL;
+ pfree(relation->rd_pubinfo);
+ relation->rd_pubinfo = NULL;
}
/* Now save copy of the actions in the relcache entry. */
oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
- relation->rd_pubactions = palloc(sizeof(PublicationActions));
- memcpy(relation->rd_pubactions, pubactions, sizeof(PublicationActions));
+ relation->rd_pubinfo = palloc(sizeof(PublicationInfo));
+ memcpy(relation->rd_pubinfo, pubinfo, sizeof(PublicationInfo));
MemoryContextSwitchTo(oldcxt);
- return pubactions;
+ return pubinfo;
}
/*
@@ -6184,7 +6266,7 @@ load_relcache_init_file(bool shared)
rel->rd_keyattr = NULL;
rel->rd_pkattr = NULL;
rel->rd_idattr = NULL;
- rel->rd_pubactions = NULL;
+ rel->rd_pubinfo = NULL;
rel->rd_statvalid = false;
rel->rd_statlist = NIL;
rel->rd_fkeyvalid = false;
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index bd0d4cec05..f698049633 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -74,6 +74,18 @@ typedef struct PublicationActions
bool pubtruncate;
} PublicationActions;
+typedef struct PublicationInfo
+{
+ PublicationActions pubactions;
+
+ /*
+ * True if pubactions do not include UPDATE or DELETE, or if all the
+ * columns in the row filter expression are part of the replica
+ * identity.
+ */
+ bool rfcol_valid_for_replid;
+} PublicationInfo;
+
typedef struct Publication
{
Oid oid;
@@ -131,5 +143,6 @@ extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
extern char *get_publication_name(Oid pubid, bool missing_ok);
+extern bool check_rowfilter_replident(Node *node, Bitmapset *bms_replident);
#endif /* PG_PUBLICATION_H */
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b4faa1c123..57cebde965 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -160,7 +160,8 @@ typedef struct RelationData
Bitmapset *rd_pkattr; /* cols included in primary key */
Bitmapset *rd_idattr; /* included in replica identity index */
- PublicationActions *rd_pubactions; /* publication actions */
+ /* data managed by RelationGetPublicationInfo: */
+ PublicationInfo *rd_pubinfo; /* publication information */
/*
* rd_options is set whenever rd_rel is loaded into the relcache entry.
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index aa060ef115..54f9825ebb 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -76,6 +76,7 @@ extern void RelationInitIndexAccessInfo(Relation relation);
/* caller must include pg_publication.h */
struct PublicationActions;
extern struct PublicationActions *GetRelationPublicationActions(Relation relation);
+extern struct PublicationInfo *RelationGetPublicationInfo(Relation relation);
extern void RelationInitTableAccessMethod(Relation relation);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index fdf7659e82..d9e42e5924 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -416,21 +416,27 @@ DROP PUBLICATION testpub6;
-- ok - "b" is a PK col
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
-RESET client_min_messages;
DROP PUBLICATION testpub6;
--- fail - "c" is not part of the PK
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
-DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
--- fail - "d" is not part of the PK
+-- fail - "c" is not part of the PK
+UPDATE rf_tbl_abcd_pk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Not all row filter columns are part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
-DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+UPDATE rf_tbl_abcd_pk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Not all row filter columns are part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
--- fail - "a" is not part of REPLICA IDENTITY
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
-DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "a" is not part of REPLICA IDENTITY
+UPDATE rf_tbl_abcd_nopk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
+DETAIL: Not all row filter columns are not part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
-- Case 2. REPLICA IDENTITY FULL
ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
@@ -444,21 +450,29 @@ SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
DROP PUBLICATION testpub6;
RESET client_min_messages;
+SET client_min_messages = 'ERROR';
-- Case 3. REPLICA IDENTITY NOTHING
ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
--- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
-DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
--- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Not all row filter columns are not part of the REPLICA IDENTITY
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
-DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
--- fail - "a" is not in REPLICA IDENTITY NOTHING
+ERROR: publication "testpub6" already exists
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Not all row filter columns are not part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
-DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_nopk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
+DETAIL: Not all row filter columns are not part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
-- Case 4. REPLICA IDENTITY INDEX
ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
@@ -466,21 +480,23 @@ ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
--- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
-DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+update rf_tbl_abcd_pk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Not all row filter columns are not part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
-SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
DROP PUBLICATION testpub6;
-RESET client_min_messages;
--- fail - "a" is not in REPLICA IDENTITY INDEX
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
-DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+update rf_tbl_abcd_nopk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
+DETAIL: Not all row filter columns are not part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
-- ok - "c" is part of REPLICA IDENTITY INDEX
-SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
DROP PUBLICATION testpub6;
RESET client_min_messages;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index c7160bd457..8f6fd65057 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -237,15 +237,21 @@ DROP PUBLICATION testpub6;
-- ok - "b" is a PK col
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
-RESET client_min_messages;
DROP PUBLICATION testpub6;
--- fail - "c" is not part of the PK
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
--- fail - "d" is not part of the PK
+-- fail - "c" is not part of the PK
+UPDATE rf_tbl_abcd_pk set a = 1;
+DROP PUBLICATION testpub6;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- fail - "d" is not part of the PK
+UPDATE rf_tbl_abcd_pk set a = 1;
+DROP PUBLICATION testpub6;
-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
--- fail - "a" is not part of REPLICA IDENTITY
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not part of REPLICA IDENTITY
+UPDATE rf_tbl_abcd_nopk set a = 1;
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
-- Case 2. REPLICA IDENTITY FULL
ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
@@ -261,15 +267,22 @@ CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
DROP PUBLICATION testpub6;
RESET client_min_messages;
+SET client_min_messages = 'ERROR';
-- Case 3. REPLICA IDENTITY NOTHING
ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
--- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
--- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk set a = 1;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
--- fail - "a" is not in REPLICA IDENTITY NOTHING
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk set a = 1;
+DROP PUBLICATION testpub6;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_nopk set a = 1;
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
-- Case 4. REPLICA IDENTITY INDEX
ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
@@ -278,17 +291,19 @@ ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
--- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+update rf_tbl_abcd_pk set a = 1;
+DROP PUBLICATION testpub6;
-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
-SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
DROP PUBLICATION testpub6;
-RESET client_min_messages;
--- fail - "a" is not in REPLICA IDENTITY INDEX
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+update rf_tbl_abcd_nopk set a = 1;
+DROP PUBLICATION testpub6;
-- ok - "c" is part of REPLICA IDENTITY INDEX
-SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
DROP PUBLICATION testpub6;
RESET client_min_messages;
--
2.18.4
On Thu, Nov 25, 2021 at 7:39 PM Euler Taveira <euler@eulerto.com> wrote:
On Thu, Nov 25, 2021, at 10:39 AM, houzj.fnst@fujitsu.com wrote:
When researching and writing a top-up patch about this, I found a possible
issue which I'd like to confirm first. It's possible the table is published in
two publications A and B; publication A only publishes "insert", publication B
publishes "update". When UPDATE, both row filters in A and B will be executed.
Is this behavior expected?
Good question. No. The code should check the action before combining the
multiple row filters.
Do you mean to say that we should give an error on Update/Delete if
any of the publications contain table rowfilter that has columns that
are not part of the primary key or replica identity? I think this is
what Hou-san has implemented in his top-up patch and I also think this
is the right behavior.
--
With Regards,
Amit Kapila.
On Fri, Nov 26, 2021 at 1:16 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
Based on this direction, I tried to write a top up POC patch(0005) which I'd like to share.
I noticed a minor issue.
In the top-up patch, the following error message detail:
+ errdetail("Not all row filter columns are not part of the REPLICA
IDENTITY")));
should be:
+ errdetail("Not all row filter columns are part of the REPLICA IDENTITY")));
Regards,
Greg Nancarrow
Fujitsu Australia
On Fri, Nov 26, 2021 11:32 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
...
Do you mean to say that we should give an error on Update/Delete if any of the
publications contain table rowfilter that has columns that are not part of the
primary key or replica identity? I think this is what Hou-san has implemented in
his top-up patch and I also think this is the right behavior.
Yes, the top-up patch will give an error if the columns in row filter are not part of
replica identity when UPDATE and DELETE.
But the point I want to confirm is that:
---
create publication A for table tbl1 where (b<2) with(publish='insert');
create publication B for table tbl1 where (a>1) with(publish='update');
---
When UPDATE on the table 'tbl1', is it correct to combine and execute both of
the row filter in A(b<2) and B(a>1)? (it's the current behavior)
Because the filter in A has an unlogged column(b) and the publication A only
publish "insert", so for UPDATE, should we skip the row filter in A and only
execute the row filter in B ?
Best regards,
Hou zj
On Fri, Nov 26, 2021 at 4:05 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
...
But the point I want to confirm is that:
---
create publication A for table tbl1 where (b<2) with(publish='insert');
create publication B for table tbl1 where (a>1) with(publish='update');
---
When UPDATE on the table 'tbl1', is it correct to combine and execute both of
the row filter in A(b<2) and B(a>1)? (it's the current behavior)
Because the filter in A has an unlogged column(b) and the publication A only
publish "insert", so for UPDATE, should we skip the row filter in A and only
execute the row filter in B?
But since the filters are OR'ed together does it even matter?
Now that your top-up patch prevents invalid updates/deletes, this
other point is only really a question about the cache performance,
isn't it?
------
Kind Regards,
Peter Smith.
Fujitsu Australia.
On Fri, Nov 26, 2021 at 4:18 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Fri, Nov 26, 2021 at 4:05 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
...
But since the filters are OR'ed together does it even matter?
Now that your top-up patch prevents invalid updates/deletes, this
other point is only really a question about the cache performance,
isn't it?
Irrespective of replica identity I think there is still a functional
behaviour question, right?
e.g.
create publication p1 for table census where (country = 'Aust') with
(publish="update")
create publication p2 for table census where (country = 'NZ') with
(publish='insert')
Should it be possible to UPDATE for country 'NZ' or not?
Is this the same as your question Hou-san?
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Fri, Nov 26, 2021 at 12:01 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Fri, Nov 26, 2021 at 4:18 PM Peter Smith <smithpb2250@gmail.com> wrote:
...
But the point I want to confirm is that:
Okay, I see your point now.
---
create publication A for table tbl1 where (b<2) with(publish='insert');
create publication B for table tbl1 where (a>1) with(publish='update');
---
When UPDATE on the table 'tbl1', is it correct to combine and execute both of
the row filter in A(b<2) and B(a>1)? (it's the current behavior)
Because the filter in A has an unlogged column(b) and the publication A only
publish "insert", so for UPDATE, should we skip the row filter in A and only
execute the row filter in B?
But since the filters are OR'ed together does it even matter?
Even if it is OR'ed, if the value is not logged (as it was not part of
replica identity or primary key) as per Hou-San's example, how will we
evaluate such a filter?
Now that your top-up patch prevents invalid updates/deletes, this
other point is only really a question about the cache performance,
isn't it?
Irrespective of replica identity I think there is still a functional
behaviour question, right?
e.g.
create publication p1 for table census where (country = 'Aust') with
(publish="update")
create publication p2 for table census where (country = 'NZ') with
(publish='insert')
Should it be possible to UPDATE for country 'NZ' or not?
Is this the same as your question Hou-san?
I am not sure if it is the same because in Hou-San's example
publications refer to different columns where one of the columns was
part of PK and another was not whereas in your example both refer to
the same column. I think in your example the error will happen at the
time of update/delete whereas in Hou-San's example it won't happen at
the time of update/delete.
With Regards,
Amit Kapila.
On Fri, Nov 26, 2021 at 1:16 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
...
Based on this direction, I tried to write a top up POC patch(0005) which I'd like to share.
The top up patch mainly did the following things.
* Move the row filter columns invalidation to CheckCmdReplicaIdentity, so that
the invalidation is executed only when actual UPDATE or DELETE executed on the
published relation. It's consistent with the existing check about replica
identity.
* Cache the results of the validation for row filter columns in relcache to
reduce the cost of the validation. It's safe because every operation that
change the row filter and replica identity will invalidate the relcache.
Also attach the v42 patch set to keep cfbot happy.
Hi Hou-san.
Thanks for providing your "top-up" 0005 patch!
I suppose the goal will be to later merge this top-up with the current
0002 validation patch, but in the meantime here are my review comments
for 0005.
======
1) src/include/catalog/pg_publication.h - PublicationInfo
+typedef struct PublicationInfo
+{
+ PublicationActions pubactions;
+
+ /*
+ * True if pubactions don't include UPDATE and DELETE or
+ * all the columns in the row filter expression are part
+ * of replica identity.
+ */
+ bool rfcol_valid_for_replid;
+} PublicationInfo;
+
IMO "PublicationInfo" sounded too much like it is about the
Publication only, but IIUC it is really *per* Relation publication
info, right? So I thought perhaps it should be called more like struct
"RelationPubInfo".
======
2) src/include/catalog/pg_publication.h - PublicationInfo
The member "rfcol_valid_for_replid" also seems a little bit mis-named
because in some scenarios (not UPDATE/DELETE) it can be true even if
there are no replica identity columns. So I thought perhaps it should
be called more like just "rfcols_valid"
Another thing - IIUC this is a kind of a "unified" boolean that covers
*all* filters for this Relation (across multiple publications). If
that is right., then the comment for this member should say something
about this.
======
3) src/include/catalog/pg_publication.h - PublicationInfo
This new typedef should be added to src/tools/pgindent/typedefs.list
======
4) src/backend/catalog/pg_publication.c - check_rowfilter_replident
+/*
+ * Check if all the columns used in the row-filter WHERE clause are part of
+ * REPLICA IDENTITY
+ */
+bool
+check_rowfilter_replident(Node *node, Bitmapset *bms_replident)
+{
IIUC here false means "valid" and true means "invalid", which is
counter-intuitive to me. So at least the true/false meaning ought to be
clarified in the function comment, and/or perhaps also rename the
function so that the return meaning is more obvious.
======
5) src/backend/executor/execReplication.c - CheckCmdReplicaIdentity
+ pubinfo = RelationGetPublicationInfo(rel);
+
IIUC this pubinfo is palloc'd *every* time by
RelationGetPublicationInfo, isn't it? If that is the case, shouldn't
CheckCmdReplicaIdentity be doing a pfree(pubinfo)?
======
6) src/backend/executor/execReplication.c - CheckCmdReplicaIdentity
+ pubinfo = RelationGetPublicationInfo(rel);
+
+ /*
+ * if not all columns in the publication row filter are part of the REPLICA
+ * IDENTITY, then it's unsafe to execute it for UPDATE and DELETE.
+ */
+ if (!pubinfo->rfcol_valid_for_replid)
+ {
+ if (cmd == CMD_UPDATE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot update table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Not all row filter columns are not part of the REPLICA
IDENTITY")));
+ else if (cmd == CMD_DELETE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot delete from table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Not all row filter columns are not part of the REPLICA
IDENTITY")));
The comment seemed worded in a confusingly negative way.
Before:
+ * if not all columns in the publication row filter are part of the REPLICA
+ * IDENTITY, then it's unsafe to execute it for UPDATE and DELETE.
My Suggestion:
It is only safe to execute UPDATE/DELETE when all columns of the
publication row filters are part of the REPLICA IDENTITY.
~~
Also, is "publication row filter" really the correct terminology?
AFAIK it is more like *all* filters for this Relation across multiple
publications, but I have not got a good idea how to word that in a
comment. Anyway, I have a feeling this whole idea might be impacted by
other discussions in this RF thread.
======
7) src/backend/executor/execReplication.c - CheckCmdReplicaIdentity
Error messages have double negative wording? I think Greg already
commented on this same point.
+ errdetail("Not all row filter columns are not part of the REPLICA
IDENTITY")));
======
8) src/backend/executor/execReplication.c - CheckCmdReplicaIdentity
But which are the bad filter columns?
Previously the Row Filter column validation gave errors for the
invalid filter column, but in this top-up patch there is no indication
which column or which filter or which publication was the bad one -
only that "something" bad was detected. IMO this might make it very
difficult for the user to know enough about the cause of the problem
to be able to fix the offending filter.
======
9) src/backend/executor/execReplication.c - CheckCmdReplicaIdentity
/* If relation has replica identity we are always good. */
if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
OidIsValid(RelationGetReplicaIndex(rel)))
I was wondering if the check for REPLICA_IDENTITY_FULL should go
*before* your new call to pubinfo = RelationGetPublicationInfo(rel);
because IIUC if *every* column is a member of the replica identity
then the filter validation is not really necessary at all.
======
10) src/backend/utils/cache/relcache.c - function
GetRelationPublicationActions
@@ -5547,22 +5548,45 @@ RelationGetExclusionInfo(Relation indexRelation,
struct PublicationActions *
GetRelationPublicationActions(Relation relation)
{
- List *puboids;
- ListCell *lc;
- MemoryContext oldcxt;
- Oid schemaid;
- PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
+ PublicationInfo *pubinfo;
+ PublicationActions *pubactions = palloc0(sizeof(PublicationInfo));
+
+ pubinfo = RelationGetPublicationInfo(relation);
Just assign pubinfo at the declaration instead of later in the function body.
======
11) src/backend/utils/cache/relcache.c - function
GetRelationPublicationActions
+ pubactions = memcpy(pubactions, relation->rd_pubinfo,
+ sizeof(PublicationActions));
Isn't that memcpy slightly incorrect and only working because the
pubactions happens to be the first member of the PublicationInfo? I
thought it should really be copying from
"&relation->rd_pubinfo->pubactions", right?
======
12) src/backend/utils/cache/relcache.c - function
GetRelationPublicationActions
Excessive blank lines following this function.
======
13). src/backend/utils/cache/relcache.c - function RelationGetPublicationInfo
+/*
+ * Get publication information for the given relation.
+ */
+struct PublicationInfo *
+RelationGetPublicationInfo(Relation relation)
+{
+ List *puboids;
+ ListCell *lc;
+ MemoryContext oldcxt;
+ Oid schemaid;
+ Bitmapset *bms_replident = NULL;
+ PublicationInfo *pubinfo = palloc0(sizeof(PublicationInfo));
+
+ pubinfo->rfcol_valid_for_replid = true;
It is not entirely clear to me why this function is always pallocing
the PublicationInfo and then returning a copy of what is stored in the
relation->rd_pubinfo. This then puts a burden on the callers (like the
GetRelationPublicationActions etc) to make sure to free that memory.
Why can't we just return the relation->rd_pubinfo directly And avoid
all the extra palloc/memcpy/free?
======
14). src/backend/utils/cache/relcache.c - function RelationGetPublicationInfo
+ /*
+ * Find what are the cols that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTIY DEFAULT means primary key or nothing.
+ */
typo "IDENTIY" -> "IDENTITY"
======
15). src/backend/utils/cache/relcache.c - function RelationGetPublicationInfo
/* Now save copy of the actions in the relcache entry. */
oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
- relation->rd_pubactions = palloc(sizeof(PublicationActions));
- memcpy(relation->rd_pubactions, pubactions, sizeof(PublicationActions));
+ relation->rd_pubinfo = palloc(sizeof(PublicationInfo));
+ memcpy(relation->rd_pubinfo, pubinfo, sizeof(PublicationInfo));
MemoryContextSwitchTo(oldcxt);
The code comment looks a bit stale now. e.g. Perhaps now it should say
"save a copy of the info" instead of "save a copy of the actions".
======
16) Tests... CREATE PUBLICATION succeeds
I have not yet reviewed any of the 0005 tests, but there was some big
behaviour difference that I noticed.
I think now with the 0005 top-up patch the replica identity validation
is deferred to when UPDATE/DELETE is executed. I don’t know if this
will be very user friendly. It means now sometimes you can
successfully CREATE a PUBLICATION even though it will fail as soon as
you try to use it.
e.g. Below I create a publication with only pubaction "update", and
although it creates OK you cannot use it as intended.
test_pub=# create table t1(a int, b int, c int);
CREATE TABLE
test_pub=# create publication ptest for table t1 where (a > 3) with
(publish="update");
CREATE PUBLICATION
test_pub=# update t1 set a = 3;
ERROR: cannot update table "t1"
DETAIL: Not all row filter columns are not part of the REPLICA IDENTITY
Should we *also* be validating the replica identity at the time of
CREATE PUBLICATION so the user can be forewarned of problems?
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Sun, Nov 28, 2021 at 6:17 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Fri, Nov 26, 2021 at 1:16 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:...
Based on this direction, I tried to write a top up POC patch(0005) which I'd like to share.
...
Now I looked at the patch 0005 test cases. Since this patch does the
RI validation at UPDATE/DELETE execution instead of at the time of
CREATE PUBLICATION it means that currently, the CREATE PUBLICATION is
always going to succeed. So IIUC I think it is accidentally missing a
DROP PUBLICATION for one of the tests because the "ERROR: publication
"testpub6" already exists" should not be happening. Below is a
fragment from the regression test publication.out I am referring to:
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
UPDATE rf_tbl_abcd_pk set a = 1;
ERROR: cannot update table "rf_tbl_abcd_pk"
DETAIL: Not all row filter columns are not part of the REPLICA IDENTITY
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
ERROR: publication "testpub6" already exists
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Sun, Nov 28, 2021 3:18 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Fri, Nov 26, 2021 at 1:16 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
...
Based on this direction, I tried to write a top up POC patch(0005) which I'd
like to share.
The top up patch mainly did the following things.
* Move the row filter columns invalidation to CheckCmdReplicaIdentity, so
that the invalidation is executed only when actual UPDATE or DELETE executed on
the published relation. It's consistent with the existing check about replica
identity.
* Cache the results of the validation for row filter columns in relcache to
reduce the cost of the validation. It's safe because every operation that
change the row filter and replica identity will invalidate the relcache.
Also attach the v42 patch set to keep cfbot happy.
Hi Hou-san.
Thanks for providing your "top-up" 0005 patch!
I suppose the goal will be to later merge this top-up with the current
0002 validation patch, but in the meantime here are my review comments
for 0005.
Thanks for the review and many valuable comments !
8) src/backend/executor/execReplication.c - CheckCmdReplicaIdentity
But which are the bad filter columns?
Previously the Row Filter column validation gave errors for the
invalid filter column, but in this top-up patch there is no indication
which column or which filter or which publication was the bad one -
only that "something" bad was detected. IMO this might make it very
difficult for the user to know enough about the cause of the problem
to be able to fix the offending filter.
If we want to report the invalid filter column, I can see two possibilities.
1) Instead of a bool flag, we cache an AttrNumber flag which indicates the
invalid column number (0 means all valid). We can report it in the error
message.
2) Every time we decide to report an error, we traverse all the publications to
find the invalid column again and report it.
What do you think ?
13). src/backend/utils/cache/relcache.c - function RelationGetPublicationInfo
+/*
+ * Get publication information for the given relation.
+ */
+struct PublicationInfo *
+RelationGetPublicationInfo(Relation relation)
+{
+	List	   *puboids;
+	ListCell   *lc;
+	MemoryContext oldcxt;
+	Oid			schemaid;
+	Bitmapset  *bms_replident = NULL;
+	PublicationInfo *pubinfo = palloc0(sizeof(PublicationInfo));
+
+	pubinfo->rfcol_valid_for_replid = true;
It is not entirely clear to me why this function is always pallocing
the PublicationInfo and then returning a copy of what is stored in the
relation->rd_pubinfo. This then puts a burden on the callers (like the
GetRelationPublicationActions etc) to make sure to free that memory.
Why can't we just return the relation->rd_pubinfo directly And avoid
all the extra palloc/memcpy/free?
Normally, I think only the cache management function should change the data in
the relcache. Returning relation->xx directly has the risk that the caller could
change the data in the relcache. So, the management function usually returns a copy
of the cached data, so that the caller is free to change it without affecting the real
cache data.
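The defensive-copy convention being described can be illustrated with a small standalone C sketch. The struct, field names, and accessor below are made up for the example; they only play the roles of rd_pubinfo and RelationGetPublicationInfo:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the publication info kept in the relcache entry. */
typedef struct PubInfo
{
    int     pubactions;     /* plays the role of the action flags */
    int     rfcol_valid;    /* plays the role of the validity flag */
} PubInfo;

static PubInfo cached_pubinfo = {0x0f, 1};  /* the "relcache" copy */

/*
 * Analogue of a cache accessor that returns an allocated copy rather
 * than a pointer into the cache: the caller may modify or free the
 * result without corrupting the shared entry.
 */
static PubInfo *
get_pubinfo_copy(void)
{
    PubInfo *copy = malloc(sizeof(PubInfo));

    memcpy(copy, &cached_pubinfo, sizeof(PubInfo));
    return copy;
}
```

The cost is the extra allocate/copy/free per call, which is exactly the trade-off being debated above.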
16) Tests... CREATE PUBLICATION succeeds
I have not yet reviewed any of the 0005 tests, but there was some big
behaviour difference that I noticed.

I think now with the 0005 top-up patch the replica identity validation
is deferred to when UPDATE/DELETE is executed. I don't know if this
will be very user friendly. It means now sometimes you can
successfully CREATE a PUBLICATION even though it will fail as soon as
you try to use it.
I am not sure; the initial idea here is to make the check of replica identity
consistent.
Currently, if a user creates a publication which publishes "update" but the relation
in the publication is not marked with a replica identity, the user can create the
publication successfully, but the later UPDATE will report an error.
Best regards,
Hou zj
On Mon, Nov 29, 2021 at 1:54 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Sun, Nov 28, 2021 3:18 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Fri, Nov 26, 2021 at 1:16 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
...
Based on this direction, I tried to write a top-up POC patch (0005) which I'd
like to share. The top-up patch mainly did the following things.

* Move the row filter columns validation to CheckCmdReplicaIdentity, so
that the validation is executed only when an actual UPDATE or DELETE is executed
on the published relation. It's consistent with the existing check about replica
identity.
* Cache the results of the validation for row filter columns in the relcache to
reduce the cost of the validation. It's safe because every operation that
changes the row filter or replica identity will invalidate the relcache.

Also attach the v42 patch set to keep cfbot happy.
Hi Hou-san.
Thanks for providing your "top-up" 0005 patch!
I suppose the goal will be to later merge this top-up with the current
0002 validation patch, but in the meantime here are my review comments
for 0005.

Thanks for the review and many valuable comments!
8) src/backend/executor/execReplication.c - CheckCmdReplicaIdentity
But which are the bad filter columns?
Previously the Row Filter column validation gave errors for the
invalid filter column, but in this top-up patch there is no indication
which column or which filter or which publication was the bad one -
only that "something" bad was detected. IMO this might make it very
difficult for the user to know enough about the cause of the problem
to be able to fix the offending filter.

If we want to report the invalid filter column, I can see two possibilities.
1) Instead of a bool flag, we cache an AttrNumber which indicates the
invalid column number (0 means all valid). We can report it in the error
message.
2) Every time we decide to report an error, we traverse all the publications
to find the invalid column again and report it.

What do you think?
Perhaps your idea #1 is good enough. At least if we provide just the
bad column name then the user can use psql \d+ to find all filter
publications that include that bad column. Maybe that can be a HINT
for the error message.
13). src/backend/utils/cache/relcache.c - function RelationGetPublicationInfo

+/*
+ * Get publication information for the given relation.
+ */
+struct PublicationInfo *
+RelationGetPublicationInfo(Relation relation)
+{
+	List	   *puboids;
+	ListCell   *lc;
+	MemoryContext oldcxt;
+	Oid			schemaid;
+	Bitmapset  *bms_replident = NULL;
+	PublicationInfo *pubinfo = palloc0(sizeof(PublicationInfo));
+
+	pubinfo->rfcol_valid_for_replid = true;

It is not entirely clear to me why this function is always pallocing
the PublicationInfo and then returning a copy of what is stored in
relation->rd_pubinfo. This then puts a burden on the callers (like
GetRelationPublicationActions etc.) to make sure to free that memory.
Why can't we just return relation->rd_pubinfo directly and avoid
all the extra palloc/memcpy/free?

Normally, I think only the cache management function should change the data in
the relcache. Returning relation->xx directly has the risk that the caller could
change the data in the relcache. So, the management function usually returns a copy
of the cached data, so that the caller is free to change it without affecting the real
cache data.
OK.
16) Tests... CREATE PUBLICATION succeeds
I have not yet reviewed any of the 0005 tests, but there was some big
behaviour difference that I noticed.

I think now with the 0005 top-up patch the replica identity validation
is deferred to when UPDATE/DELETE is executed. I don't know if this
will be very user friendly. It means now sometimes you can
successfully CREATE a PUBLICATION even though it will fail as soon as
you try to use it.

I am not sure; the initial idea here is to make the check of replica identity
consistent.
Currently, if a user creates a publication which publishes "update" but the relation
in the publication is not marked with a replica identity, the user can create the
publication successfully, but the later UPDATE will report an error.
OK. I see there is a different perspective; I will leave this to see
what other people think.
------
Kind Regards,
Peter Smith.
Fujitsu Australia.
On Fri, Nov 26, 2021 at 12:40 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
When researching and writing a top-up patch about this,
I found a possible issue which I'd like to confirm first.

It's possible the table is published in two publications A and B; publication A
only publishes "insert", publication B publishes "update". On UPDATE, both row
filters in A and B will be executed. Is this behavior expected?

For example:
---- Publication
create table tbl1 (a int primary key, b int);
create publication A for table tbl1 where (b<2) with(publish='insert');
create publication B for table tbl1 where (a>1) with(publish='update');
---- Subscription
create table tbl1 (a int primary key);
CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres host=localhost
port=10000' PUBLICATION A,B;
---- Publication
update tbl1 set a = 2;

The publication can be created, and on UPDATE the row filter in A (b<2) will
also be executed, but the column in it is not part of the replica identity.
(I am not against this behavior, just confirming.)
There seem to be problems related to allowing the row filter to
include columns that are not part of the replica identity (in the case
of publish=insert).
In your example scenario, the tbl1 WHERE clause "(b < 2)" for
publication A, that publishes inserts only, causes a problem, because
column "b" is not part of the replica identity.
To see this, follow the simple example below:
(and note, for the Subscription, the provided tbl1 definition has an
error, it should also include the 2nd column "b int", same as in the
publisher)
---- Publisher:
INSERT INTO tbl1 VALUES (1,1);
UPDATE tbl1 SET a = 2;
Prior to the UPDATE above:
On pub side, tbl1 contains (1,1).
On sub side, tbl1 contains (1,1)
After the above UPDATE:
On pub side, tbl1 contains (2,1).
On sub side, tbl1 contains (1,1), (2,1)
So the UPDATE on the pub side has resulted in an INSERT of (2,1) on
the sub side.
This is because when (1,1) is UPDATEd to (2,1), it attempts to use the
"insert" filter "(b<2)" to determine whether the old value had been
inserted (published to subscriber), but finds there is no "b" value
(because it only uses RI cols for UPDATE) and so has to assume the old
tuple doesn't exist on the subscriber, hence the UPDATE ends up doing
an INSERT.
Now if the use of RI cols were enforced for the insert filter case,
we'd properly know the answer as to whether the old row value had been
published and it would have correctly performed an UPDATE instead of
an INSERT in this case.
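The old/new-tuple decision being described can be sketched as a small truth table in standalone C. This is an illustration of the idea, not the patch's code; the enum and function names are invented:

```c
#include <assert.h>

typedef enum
{
    SEND_NOTHING,               /* row never visible: skip */
    SEND_INSERT,                /* row "entered" the filter */
    SEND_UPDATE,                /* row visible before and after */
    SEND_DELETE                 /* row "left" the filter */
} SendAction;

/*
 * Decide what to send downstream for an UPDATE, given whether the old
 * and new row versions match the row filter. When the old tuple lacks
 * a non-RI filter column (like "b" above), old_matches has to be
 * treated as false, which is why the UPDATE gets published as an INSERT.
 */
static SendAction
transform_update(int old_matches, int new_matches)
{
    if (old_matches && new_matches)
        return SEND_UPDATE;
    if (!old_matches && new_matches)
        return SEND_INSERT;
    if (old_matches && !new_matches)
        return SEND_DELETE;
    return SEND_NOTHING;
}
```

In the (1,1) → (2,1) scenario above, old_matches is forced to false because "b" is unavailable, so the table lands in the SEND_INSERT row.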
Thoughts?
Regards,
Greg Nancarrow
Fujitsu Australia
On Mon, Nov 29, 2021 at 12:10 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Fri, Nov 26, 2021 at 12:40 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

When researching and writing a top-up patch about this,
I found a possible issue which I'd like to confirm first.

It's possible the table is published in two publications A and B; publication A
only publishes "insert", publication B publishes "update". On UPDATE, both row
filters in A and B will be executed. Is this behavior expected?

For example:
---- Publication
create table tbl1 (a int primary key, b int);
create publication A for table tbl1 where (b<2) with(publish='insert');
create publication B for table tbl1 where (a>1) with(publish='update');
---- Subscription
create table tbl1 (a int primary key);
CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres host=localhost
port=10000' PUBLICATION A,B;
---- Publication
update tbl1 set a = 2;

The publication can be created, and on UPDATE the row filter in A (b<2) will
also be executed, but the column in it is not part of the replica identity.
(I am not against this behavior, just confirming.)

There seem to be problems related to allowing the row filter to
include columns that are not part of the replica identity (in the case
of publish=insert).
In your example scenario, the tbl1 WHERE clause "(b < 2)" for
publication A, that publishes inserts only, causes a problem, because
column "b" is not part of the replica identity.
To see this, follow the simple example below:
(and note, for the Subscription, the provided tbl1 definition has an
error, it should also include the 2nd column "b int", same as in the
publisher)

---- Publisher:
INSERT INTO tbl1 VALUES (1,1);
UPDATE tbl1 SET a = 2;

Prior to the UPDATE above:
On pub side, tbl1 contains (1,1).
On sub side, tbl1 contains (1,1)

After the above UPDATE:
On pub side, tbl1 contains (2,1).
On sub side, tbl1 contains (1,1), (2,1)

So the UPDATE on the pub side has resulted in an INSERT of (2,1) on
the sub side.

This is because when (1,1) is UPDATEd to (2,1), it attempts to use the
"insert" filter "(b<2)" to determine whether the old value had been
inserted (published to subscriber), but finds there is no "b" value
(because it only uses RI cols for UPDATE) and so has to assume the old
tuple doesn't exist on the subscriber, hence the UPDATE ends up doing
an INSERT.

Now if the use of RI cols were enforced for the insert filter case,
we'd properly know the answer as to whether the old row value had been
published and it would have correctly performed an UPDATE instead of
an INSERT in this case.
I don't think it is a good idea to combine the row-filter from the
publication that publishes just 'insert' with the row-filter that
publishes 'updates'. We shouldn't apply the 'insert' filter for
'update' and similarly for publication operations. We can combine the
filters when the published operations are the same. So, this means
that we might need to cache multiple row-filters but I think that is
better than having another restriction that publish operation 'insert'
should also honor RI columns restriction.
--
With Regards,
Amit Kapila.
On Mon, Nov 29, 2021 at 3:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
---- Publisher:
INSERT INTO tbl1 VALUES (1,1);
UPDATE tbl1 SET a = 2;

Prior to the UPDATE above:
On pub side, tbl1 contains (1,1).
On sub side, tbl1 contains (1,1)

After the above UPDATE:
On pub side, tbl1 contains (2,1).
On sub side, tbl1 contains (1,1), (2,1)

So the UPDATE on the pub side has resulted in an INSERT of (2,1) on
the sub side.

This is because when (1,1) is UPDATEd to (2,1), it attempts to use the
"insert" filter "(b<2)" to determine whether the old value had been
inserted (published to subscriber), but finds there is no "b" value
(because it only uses RI cols for UPDATE) and so has to assume the old
tuple doesn't exist on the subscriber, hence the UPDATE ends up doing
an INSERT.

Now if the use of RI cols were enforced for the insert filter case,
we'd properly know the answer as to whether the old row value had been
published and it would have correctly performed an UPDATE instead of
an INSERT in this case.

I don't think it is a good idea to combine the row-filter from the
publication that publishes just 'insert' with the row-filter that
publishes 'updates'. We shouldn't apply the 'insert' filter for
'update' and similarly for other publication operations. We can combine the
filters when the published operations are the same. So, this means
that we might need to cache multiple row-filters but I think that is
better than having another restriction that publish operation 'insert'
should also honor the RI columns restriction.
I am just wondering, if we don't combine filters in the above case,
what data will we send to the subscriber if the operation is
"UPDATE tbl1 SET a = 2, b=3"? In this case, we will apply only the
update filter, i.e. a > 1, so as per that this will become an INSERT
operation because the old row was not passing the filter. So now we
will insert a new row on the subscriber side with value (2,3). It looks
a bit odd to me that the value b=3 would have been rejected with a
direct insert but is allowed due to the indirect insert done by the update.
Does this behavior look odd only to me?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Mon, Nov 29, 2021 at 8:24 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Sun, Nov 28, 2021 3:18 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Fri, Nov 26, 2021 at 1:16 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
...
Based on this direction, I tried to write a top-up POC patch (0005) which I'd
like to share. The top-up patch mainly did the following things.

* Move the row filter columns validation to CheckCmdReplicaIdentity, so
that the validation is executed only when an actual UPDATE or DELETE is executed
on the published relation. It's consistent with the existing check about replica
identity.
* Cache the results of the validation for row filter columns in the relcache to
reduce the cost of the validation. It's safe because every operation that
changes the row filter or replica identity will invalidate the relcache.

Also attach the v42 patch set to keep cfbot happy.
Hi Hou-san.
Thanks for providing your "top-up" 0005 patch!
I suppose the goal will be to later merge this top-up with the current
0002 validation patch, but in the meantime here are my review comments
for 0005.

Thanks for the review and many valuable comments!
8) src/backend/executor/execReplication.c - CheckCmdReplicaIdentity
But which are the bad filter columns?
Previously the Row Filter column validation gave errors for the
invalid filter column, but in this top-up patch there is no indication
which column or which filter or which publication was the bad one -
only that "something" bad was detected. IMO this might make it very
difficult for the user to know enough about the cause of the problem
to be able to fix the offending filter.

If we want to report the invalid filter column, I can see two possibilities.
1) Instead of a bool flag, we cache an AttrNumber which indicates the
invalid column number (0 means all valid). We can report it in the error
message.
2) Every time we decide to report an error, we traverse all the publications
to find the invalid column again and report it.

What do you think?
I think we can probably give an error inside
RelationGetPublicationInfo (we can change the name of the function
based on the changed functionality). Basically, if the row_filter is valid
then we can copy the publication info from the relcache and return it at the
beginning; otherwise, allow it to check the publications again. In error
cases, it shouldn't matter much to not use the cached information.
This is to some extent how other parameters like rd_fkeyvalid and
rd_partcheckvalid work. One more thing: similar to some of the other
fields, isn't it better to manage pubactions and the new bool flag directly
in the relation instead of using PublicationInfo?
16) Tests... CREATE PUBLICATION succeeds
I have not yet reviewed any of the 0005 tests, but there was some big
behaviour difference that I noticed.

I think now with the 0005 top-up patch the replica identity validation
is deferred to when UPDATE/DELETE is executed. I don't know if this
will be very user friendly. It means now sometimes you can
successfully CREATE a PUBLICATION even though it will fail as soon as
you try to use it.

I am not sure; the initial idea here is to make the check of replica identity
consistent.
Currently, if a user creates a publication which publishes "update" but the relation
in the publication is not marked with a replica identity, the user can create the
publication successfully, but the later UPDATE will report an error.
Yeah, I think giving an error on Update/Delete should be okay.
--
With Regards,
Amit Kapila.
On Sun, Nov 28, 2021 at 12:48 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Fri, Nov 26, 2021 at 1:16 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

4) src/backend/catalog/pg_publication.c - check_rowfilter_replident

+/*
+ * Check if all the columns used in the row-filter WHERE clause are part of
+ * REPLICA IDENTITY
+ */
+bool
+check_rowfilter_replident(Node *node, Bitmapset *bms_replident)
+{

IIUC here false means "valid" and true means "invalid", which is
counter-intuitive to me. So at least the true/false meaning ought to be
clarified in the function comment, and/or perhaps also rename the
function so that the return meaning is more obvious.
+1 to rename the function in this case.
--
With Regards,
Amit Kapila.
On Mon, Nov 29, 2021 at 4:36 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Mon, Nov 29, 2021 at 3:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
---- Publisher:
INSERT INTO tbl1 VALUES (1,1);
UPDATE tbl1 SET a = 2;

Prior to the UPDATE above:
On pub side, tbl1 contains (1,1).
On sub side, tbl1 contains (1,1)

After the above UPDATE:
On pub side, tbl1 contains (2,1).
On sub side, tbl1 contains (1,1), (2,1)

So the UPDATE on the pub side has resulted in an INSERT of (2,1) on
the sub side.

This is because when (1,1) is UPDATEd to (2,1), it attempts to use the
"insert" filter "(b<2)" to determine whether the old value had been
inserted (published to subscriber), but finds there is no "b" value
(because it only uses RI cols for UPDATE) and so has to assume the old
tuple doesn't exist on the subscriber, hence the UPDATE ends up doing
an INSERT.

Now if the use of RI cols were enforced for the insert filter case,
we'd properly know the answer as to whether the old row value had been
published and it would have correctly performed an UPDATE instead of
an INSERT in this case.

I don't think it is a good idea to combine the row-filter from the
publication that publishes just 'insert' with the row-filter that
publishes 'updates'. We shouldn't apply the 'insert' filter for
'update' and similarly for publication operations. We can combine the
filters when the published operations are the same. So, this means
that we might need to cache multiple row-filters but I think that is
better than having another restriction that publish operation 'insert'
should also honor RI columns restriction.

I am just wondering that if we don't combine filters in the above case
then what data we will send to the subscriber if the operation is
"UPDATE tbl1 SET a = 2, b=3", so in this case, we will apply only the
update filter i.e. a > 1 so as per that this will become the INSERT
operation because the old row was not passing the filter.
If we want, I think for inserts (new row) we can consider the insert
filter as well but that makes it tricky to explain. I feel we can
change it later as well if there is a valid use case for this. What do
you think?
--
With Regards,
Amit Kapila.
On Mon, Nov 29, 2021 at 5:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I don't think it is a good idea to combine the row-filter from the
publication that publishes just 'insert' with the row-filter that
publishes 'updates'. We shouldn't apply the 'insert' filter for
'update' and similarly for publication operations. We can combine the
filters when the published operations are the same. So, this means
that we might need to cache multiple row-filters but I think that is
better than having another restriction that publish operation 'insert'
should also honor RI columns restriction.

I am just wondering that if we don't combine filters in the above case
then what data we will send to the subscriber if the operation is
"UPDATE tbl1 SET a = 2, b=3", so in this case, we will apply only the
update filter i.e. a > 1 so as per that this will become the INSERT
operation because the old row was not passing the filter.

If we want, I think for inserts (new row) we can consider the insert
filter as well but that makes it tricky to explain. I feel we can
change it later as well if there is a valid use case for this. What do
you think?
Yeah, that makes sense.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Mon, Nov 29, 2021, at 7:11 AM, Amit Kapila wrote:
I don't think it is a good idea to combine the row-filter from the
publication that publishes just 'insert' with the row-filter that
publishes 'updates'. We shouldn't apply the 'insert' filter for
'update' and similarly for publication operations. We can combine the
filters when the published operations are the same. So, this means
that we might need to cache multiple row-filters but I think that is
better than having another restriction that publish operation 'insert'
should also honor RI columns restriction.
That's exactly what I meant to say but apparently I didn't explain it in detail.
If a subscriber has multiple publications and a table is part of these
publications with different row filters, it should check the publication action
*before* including it in the row filter list. It means that an UPDATE operation
cannot apply a row filter that is part of a publication that has only INSERT as
an action. Having said that we cannot always combine multiple row filter
expressions into one. Instead, it should cache individual row filter expressions
and apply the OR during the row filter execution (as I did in the initial
patches before this caching stuff). The other idea is to have multiple caches
for each action. The main disadvantage of this approach is to create 4x
entries.
I'm experimenting with the first approach, which stores multiple row filters and
their publication actions, right now. Unfortunately we cannot use
relentry->pubactions because it aggregates this information if you have
multiple entries. It seems a separate array should store this information that
will be used later while evaluating the row filter -- around the
pgoutput_row_filter_exec_expr() call.
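The per-action idea can be sketched in standalone C: keep each publication's filter as a separate entry tagged with its pubactions, skip the ones whose actions don't cover the current operation, and OR the rest at execution time. The names, structs, and the function-pointer stand-in for a compiled expression are all invented for this example:

```c
#include <assert.h>

typedef int (*FilterFn)(int a, int b); /* stand-in for a compiled ExprState */

typedef struct FilterEntry
{
    FilterFn    filter;
    int         pub_insert;     /* does this publication publish insert? */
    int         pub_update;     /* does this publication publish update? */
} FilterEntry;

/* Publication A: WHERE (b < 2), publish = 'insert' */
static int filter_a(int a, int b) { (void) a; return b < 2; }
/* Publication B: WHERE (a > 1), publish = 'update' */
static int filter_b(int a, int b) { (void) b; return a > 1; }

/*
 * OR together only the filters whose publication publishes "update";
 * if no publication with a filter publishes update, publish the row.
 */
static int
row_matches_for_update(const FilterEntry *entries, int n, int a, int b)
{
    int have_any = 0;

    for (int i = 0; i < n; i++)
    {
        if (!entries[i].pub_update)
            continue;           /* e.g. skip the insert-only filter */
        have_any = 1;
        if (entries[i].filter(a, b))
            return 1;
    }
    return !have_any;
}

/* The two-publication scenario discussed earlier in the thread. */
static int
demo_update_match(int a, int b)
{
    FilterEntry entries[] = {
        {filter_a, 1, 0},
        {filter_b, 0, 1},
    };

    return row_matches_for_update(entries, 2, a, b);
}
```

For the row (a=2, b=5), only publication B's filter participates in the UPDATE decision, so publication A's "(b<2)" never needs a non-RI column.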
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Mon, Nov 29, 2021 6:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Nov 29, 2021 at 12:10 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Fri, Nov 26, 2021 at 12:40 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

When researching and writing a top-up patch about this,
I found a possible issue which I'd like to confirm first.

It's possible the table is published in two publications A and B;
publication A only publishes "insert", publication B publishes
"update". On UPDATE, both row filters in A and B will be executed. Is this
behavior expected?

For example:
---- Publication
create table tbl1 (a int primary key, b int);
create publication A for table tbl1 where (b<2) with(publish='insert');
create publication B for table tbl1 where (a>1) with(publish='update');
---- Subscription
create table tbl1 (a int primary key);
CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres host=localhost
port=10000' PUBLICATION A,B;
---- Publication
update tbl1 set a = 2;

The publication can be created, and on UPDATE the row filter in A
(b<2) will also be executed, but the column in it is not part of the replica
identity. (I am not against this behavior, just confirming.)
There seem to be problems related to allowing the row filter to
include columns that are not part of the replica identity (in the case
of publish=insert).
In your example scenario, the tbl1 WHERE clause "(b < 2)" for
publication A, that publishes inserts only, causes a problem, because
column "b" is not part of the replica identity.
To see this, follow the simple example below:
(and note, for the Subscription, the provided tbl1 definition has an
error, it should also include the 2nd column "b int", same as in the
publisher)

---- Publisher:
INSERT INTO tbl1 VALUES (1,1);
UPDATE tbl1 SET a = 2;

Prior to the UPDATE above:
On pub side, tbl1 contains (1,1).
On sub side, tbl1 contains (1,1)

After the above UPDATE:
On pub side, tbl1 contains (2,1).
On sub side, tbl1 contains (1,1), (2,1)

So the UPDATE on the pub side has resulted in an INSERT of (2,1) on
the sub side.

This is because when (1,1) is UPDATEd to (2,1), it attempts to use the
"insert" filter "(b<2)" to determine whether the old value had been
inserted (published to subscriber), but finds there is no "b" value
(because it only uses RI cols for UPDATE) and so has to assume the old
tuple doesn't exist on the subscriber, hence the UPDATE ends up doing
an INSERT.

Now if the use of RI cols were enforced for the insert filter case,
we'd properly know the answer as to whether the old row value had been
published and it would have correctly performed an UPDATE instead of
an INSERT in this case.

I don't think it is a good idea to combine the row-filter from the publication
that publishes just 'insert' with the row-filter that publishes 'updates'. We
shouldn't apply the 'insert' filter for 'update' and similarly for publication
operations. We can combine the filters when the published operations are the
same. So, this means that we might need to cache multiple row-filters but I
think that is better than having another restriction that publish operation
'insert' should also honor RI columns restriction.
Personally, I agree that an UPDATE operation should only apply a row filter that
is part of a publication that publishes UPDATE.
Best regards,
Hou zj
On Thursday, November 25, 2021 11:22 AM Peter Smith <smithpb2250@gmail.com> wrote:
Thanks for all the review comments so far! We are endeavouring to keep
pace with them.

All feedback is being tracked and we will fix and/or reply to everything ASAP.
Meanwhile, PSA the latest set of v42* patches.
This version was mostly a patch restructuring exercise but it also
addresses some minor review comments in passing.
Thanks for your patch.
I have two comments on the document in 0001 patch.
1.
+ New row is used and it contains all columns. A <literal>NULL</literal> value
+ causes the expression to evaluate to false; avoid using columns without
I don't quite understand the sentence 'A NULL value causes the expression to
evaluate to false'. An expression containing a NULL value can also return true.
Could you be more specific?
For example:
postgres=# select null or true;
?column?
----------
t
(1 row)
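The observation is correct: SQL uses three-valued logic, so "NULL OR true" is true. What the documentation sentence presumably means is that the row filter publishes a row only when the whole WHERE expression evaluates to TRUE, so a NULL *result* is treated like false. A tiny three-valued-logic sketch (illustrative standalone C, not PostgreSQL code):

```c
#include <assert.h>

typedef enum { TV_FALSE, TV_TRUE, TV_NULL } TriBool;

/* SQL-style three-valued OR: NULL OR TRUE is TRUE, NULL OR FALSE is NULL. */
static TriBool
tv_or(TriBool x, TriBool y)
{
    if (x == TV_TRUE || y == TV_TRUE)
        return TV_TRUE;
    if (x == TV_NULL || y == TV_NULL)
        return TV_NULL;
    return TV_FALSE;
}

/* Row-filter semantics: only a TRUE overall result publishes the row. */
static int
row_passes(TriBool where_result)
{
    return where_result == TV_TRUE;
}
```

So a sub-expression containing NULL can still yield TRUE and publish the row; it is only a NULL overall result that is treated like false, which is probably what the doc sentence should say.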
2.
+ at all then all other filters become redundant. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row filtering
+ is ignored.
If the subscriber is a PostgreSQL version before 15, it seems row filtering will
be ignored only when copying the initial data; the later changes will still have
row filtering applied. Should we make this clear in the document?
Regards,
Tang
On Mon, Nov 29, 2021 at 8:40 PM Euler Taveira <euler@eulerto.com> wrote:
On Mon, Nov 29, 2021, at 7:11 AM, Amit Kapila wrote:
I don't think it is a good idea to combine the row-filter from the
publication that publishes just 'insert' with the row-filter that
publishes 'updates'. We shouldn't apply the 'insert' filter for
'update' and similarly for publication operations. We can combine the
filters when the published operations are the same. So, this means
that we might need to cache multiple row-filters but I think that is
better than having another restriction that publish operation 'insert'
should also honor RI columns restriction.

That's exactly what I meant to say but apparently I didn't explain it in detail.
If a subscriber has multiple publications and a table is part of these
publications with different row filters, it should check the publication action
*before* including it in the row filter list. It means that an UPDATE operation
cannot apply a row filter that is part of a publication that has only INSERT as
an action. Having said that we cannot always combine multiple row filter
expressions into one. Instead, it should cache individual row filter expressions
and apply the OR during the row filter execution (as I did in the initial
patches before this caching stuff). The other idea is to have multiple caches
for each action. The main disadvantage of this approach is to create 4x
entries.

I'm experimenting with the first approach, which stores multiple row filters and
their publication actions, right now.
We can try that way but I think we should still be able to combine in
many cases like where all the operations are specified for
publications having the table or maybe pubactions are same. So, we
should not give up on those cases. We can do this new logic only when
we find that pubactions are different and probably store them as
independent expressions and corresponding pubactions for it at the
current location in the v42* patch (in pgoutput_row_filter). It is
okay to combine them at a later stage during execution when we can't
do it at the time of forming cache entry.
--
With Regards,
Amit Kapila.
On Tue, Nov 30, 2021 at 10:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Nov 29, 2021 at 8:40 PM Euler Taveira <euler@eulerto.com> wrote:
On Mon, Nov 29, 2021, at 7:11 AM, Amit Kapila wrote:
I don't think it is a good idea to combine the row-filter from the
publication that publishes just 'insert' with the row-filter that
publishes 'updates'. We shouldn't apply the 'insert' filter for
'update' and similarly for publication operations. We can combine the
filters when the published operations are the same. So, this means
that we might need to cache multiple row-filters but I think that is
better than having another restriction that publish operation 'insert'
should also honor RI columns restriction.

That's exactly what I meant to say but apparently I didn't explain it in detail.
If a subscriber has multiple publications and a table is part of these
publications with different row filters, it should check the publication action
*before* including it in the row filter list. It means that an UPDATE operation
cannot apply a row filter that is part of a publication that has only INSERT as
an action. Having said that we cannot always combine multiple row filter
expressions into one. Instead, it should cache individual row filter expressions
and apply the OR during the row filter execution (as I did in the initial
patches before this caching stuff). The other idea is to have multiple caches
for each action. The main disadvantage of this approach is to create 4x
entries.

I'm experimenting with the first approach, which stores multiple row filters and
their publication actions, right now.

We can try that way but I think we should still be able to combine in
many cases like where all the operations are specified for
publications having the table or maybe pubactions are same. So, we
should not give up on those cases. We can do this new logic only when
we find that pubactions are different and probably store them as
independent expressions and corresponding pubactions for it at the
current location in the v42* patch (in pgoutput_row_filter). It is
okay to combine them at a later stage during execution when we can't
do it at the time of forming cache entry.
What about the initial table sync? During that, are we going to
combine all the filters or are we going to apply only the insert
filters?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Thu, Nov 25, 2021 at 2:22 PM Peter Smith <smithpb2250@gmail.com> wrote:
Thanks for all the review comments so far! We are endeavouring to keep
pace with them.

All feedback is being tracked and we will fix and/or reply to everything ASAP.
Meanwhile, PSA the latest set of v42* patches.
This version was mostly a patch restructuring exercise but it also
addresses some minor review comments in passing.
Addressed more review comments in the attached patch-set v43. Five
patches carried forward from v42.
This patch-set contains the following fixes:
On Tue, Nov 23, 2021 at 1:28 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
In pgoutput_row_filter, we are dropping the slots if there are some
old slots in the RelationSyncEntry. But then I noticed that
rel_sync_cache_relation_cb() is also doing that, though only for the
scantuple slot. So IMHO, rel_sync_cache_relation_cb() is the only place
that sets entry->rowfilter_valid to false, so why not drop all the slots
there, and in pgoutput_row_filter() just put an assert?
Moved all the dropping of slots to rel_sync_cache_relation_cb()
+static bool
+pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot, RelationSyncEntry *entry)
+{
+    EState     *estate;
+    ExprContext *ecxt;

pgoutput_row_filter_virtual and pgoutput_row_filter are exactly the same
except for ExecStoreHeapTuple(), so why not just put one check based on
whether a slot is passed or not, instead of making a complete duplicate
copy of the function?
Removed pgoutput_row_filter_virtual
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);

Why do we need to copy the tupledesc? Do we think that we need to keep
this slot even after we close the relation? If so, can you add comments
explaining why we are making a copy here.
This code has been modified, and comments added.
On Tue, Nov 23, 2021 at 8:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
One more thing related to this code:

pgoutput_row_filter()
{
..
+   if (!entry->rowfilter_valid)
    {
..
+       oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+       tupdesc = CreateTupleDescCopy(tupdesc);
+       entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+       MemoryContextSwitchTo(oldctx);
..
}

Why do we need to initialize scantuple here unless we are sure that
the row filter is going to get associated with this relentry? I think
when there is no row filter then this allocation is not required.
Modified as suggested.
On Tue, Nov 23, 2021 at 10:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
In the 0003 patch, why is the below change required?

--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -1,4 +1,4 @@
-/*-------------------------------------------------------------------------
+/*------------------------------------------------------------------------
 *
 * pgoutput.c
Removed.
After the above, rearrange the code in pgoutput_row_filter() so that the
two different checks related to 'rfisnull' (introduced by different
patches) can be combined into an if .. else check.
Fixed.
On Thu, Nov 25, 2021 at 12:03 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
+     * If the new relation or the old relation has a where clause,
+     * we need to remove it so that it can be added afresh later.
+     */
+    if (RelationGetRelid(newpubrel->relation) == oldrelid &&
+        newpubrel->whereClause == NULL && rfisnull)

Can't we use _equalPublicationTable() here? It compares the whereClause as well.

Tried this; it can't be done because one comes from an ALTER statement
while the other comes from the publication catalog, so the whereClause is
not the same node type. In the statement the whereClause is a T_A_Expr,
while in the publication catalog it is a T_OpExpr.
/* Must be owner of the table or superuser. */
-    if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+    if (!pg_class_ownercheck(relid, GetUserId()))

Here, you can directly use RelationGetRelid as was used in the
previous code, without using an additional variable.
Fixed.
2.

+typedef struct
+{
+    Relation    rel;
+    bool        check_replident;
+    Bitmapset  *bms_replident;
+}
+rf_context;

Add rf_context on the same line where } ends.
Code has been modified, this comment no longer applies.
4.

+ * Rules: Node-type validation
+ * ---------------------------
+ * Allow only simple or compound expressions like:
+ * - "(Var Op Const)" or

It seems Var Op Var is allowed. I tried the below and it works:

create publication pub for table t1 where (c1 < c2) WITH (publish = 'insert');

I think it should be okay to allow it provided we ensure that we never
access some other table/view etc. as part of the expression. Also, we
should document the behavior correctly.
Fixed.
On Wed, Nov 24, 2021 at 8:52 PM vignesh C <vignesh21@gmail.com> wrote:
4) This should be included in typedefs.list; also, we could add some
comments for this structure:

+typedef struct
+{
+    Relation    rel;
+    Bitmapset  *bms_replident;
+}
+rf_context;

This has been removed in the latest patch, so the comment no longer applies.
5) A few includes are not required: #include "miscadmin.h" is not required
in pg_publication.c, #include "executor/executor.h" is not required in
proto.c, and #include "access/xact.h", #include "executor/executor.h" and
#include "replication/logicalrelation.h" are not required in pgoutput.c.
Optimized this: removed "executor/executor.h" from patch 0003, removed
"access/xact.h" from patch 0001, and removed "replication/logicalrelation.h"
from 0001. The others are required.
6) Typo: "filte" should be "filter":

+/*
+ * The row filte walker checks that the row filter expression is legal.
+ *
+ * Rules: Node-type validation
+ * ---------------------------
+ * Allow only simple or compound expressions like:
+ * - "(Var Op Const)" or
+ * - "(Var Op Const) Bool (Var Op Const)"
Fixed.
regards,
Ajin Cherian
Fujitsu Australia
Attachments:
v43-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v43-0001-Row-filter-for-logical-replication.patchDownload
From f2d32078cdfdee778880702931148d49cd432777 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Mon, 29 Nov 2021 23:19:58 -0500
Subject: [PATCH v43 1/5] Row filter for logical replication.
This feature adds row filtering for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause must contain only columns that are covered by the REPLICA
IDENTITY, or are part of the primary key (when REPLICA IDENTITY is not set);
otherwise DELETE or UPDATE operations will not be replicated. That's because
the old row is used and it only contains the primary key or the columns that
are part of the REPLICA IDENTITY; the remaining columns are NULL. For INSERT
operations any column may be used in the WHERE clause. If the row filter
evaluates to NULL, it is treated as false. For simplicity, functions are not
allowed; this could be addressed in a future patch.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is pulled by the subscriber. If the subscription has several
publications in which a table has been published with different WHERE clauses,
rows which satisfy ANY of the expressions will be copied. If the subscriber
is a pre-15 version, data synchronization won't use row filters even if they
are defined on the publisher, since previous versions cannot handle row
filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
Psql commands \dRp+ and \d+ will display any row filters.
Author: Euler Taveira, Peter Smith
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
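As a concrete illustration of the feature described above (the table,
filter, and connection string are made up for the example):

```sql
-- Publisher: replicate only high-value rows of this table.
CREATE TABLE orders (id int PRIMARY KEY, total numeric);
CREATE PUBLICATION pub_orders FOR TABLE orders WHERE (total > 100);

-- On the publisher, \dRp+ pub_orders shows the table together with
-- its row filter.

-- Subscriber: nothing special is needed; rows failing the filter are
-- simply never sent (and, with copy_data, never copied during sync).
CREATE SUBSCRIPTION sub_orders
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION pub_orders;
```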
Combining multiple row-filters
==============================
The subscription is treated "as a union of all the publications" [1], so the
row-filters are combined with OR.
If the subscription has several publications in which the same table has been
published with different filters, those expressions get OR'ed together so that
rows satisfying any of the expressions will be replicated.
Notice that this means if one of the publications has no filter at all, then
all the other filters become redundant.
Author: Peter Smith
[1] https://www.postgresql.org/message-id/574b4e78-2f35-acf3-4bdc-4b872582e739%40enterprisedb.com
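The OR-combination rule can be sketched as follows (hypothetical names):

```sql
CREATE PUBLICATION pub_a FOR TABLE t WHERE (x > 10);
CREATE PUBLICATION pub_b FOR TABLE t WHERE (y < 5);

-- A subscription to both pub_a and pub_b behaves as if the filter were
--   (x > 10) OR (y < 5)
-- and if a third publication published t with no filter at all, every
-- row of t would qualify, making the other filters redundant.
CREATE SUBSCRIPTION s
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION pub_a, pub_b;
```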
Row-filter caching
==================
The cached row-filters (e.g. ExprState *) are invalidated only in the
function rel_sync_cache_relation_cb, which means that an ALTER PUBLICATION
for one table will not cause the row-filters of other tables to become
invalidated as well.
The caching of row-filters is done just before they are needed
(in the pgoutput_row_filter function).
If there are multiple publication filters for a given table these are all
combined/flattened into a single filter.
Author: Peter Smith, Greg Nancarrow
The filter caching is based on suggestions from Amit [1] [2] and Houz [3].
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
[3] https://www.postgresql.org/message-id/OS0PR01MB5716090A70A73ADF58C58950948D9%40OS0PR01MB5716.jpnprd01.prod.outlook.com
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 13 +-
doc/src/sgml/ref/create_publication.sgml | 34 ++-
doc/src/sgml/ref/create_subscription.sgml | 17 ++
src/backend/catalog/pg_publication.c | 48 +++-
src/backend/commands/publicationcmds.c | 77 ++++--
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 38 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 116 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 344 ++++++++++++++++++++++++++-
src/bin/psql/describe.c | 27 ++-
src/include/catalog/pg_publication.h | 3 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 148 ++++++++++++
src/test/regress/sql/publication.sql | 75 ++++++
src/test/subscription/t/026_row_filter.pl | 357 ++++++++++++++++++++++++++++
23 files changed, 1307 insertions(+), 49 deletions(-)
mode change 100644 => 100755 src/backend/parser/gram.y
create mode 100644 src/test/subscription/t/026_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1d11be..af6b1f6 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6299,6 +6299,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index bb4ef5e..01247d7 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -30,7 +30,7 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -52,7 +52,9 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
remove one or more tables/schemas from the publication. Note that adding
tables/schemas to a publication that is already subscribed to will require a
<literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on the
- subscribing side in order to become effective.
+ subscribing side in order to become effective. Note also that the combination
+ of row-filter <literal>WHERE</literal> for <literal>DROP</literal> clause is
+ not allowed.
</para>
<para>
@@ -109,7 +111,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 4aeb0c8..851f48c 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -28,7 +28,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -76,6 +76,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -226,6 +230,22 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The <literal>WHERE</literal> clause must contain only columns that are
+ covered by <literal>REPLICA IDENTITY</literal>, or are part of the primary
+ key (when <literal>REPLICA IDENTITY</literal> is not set), otherwise
+ <command>DELETE</command> or <command>UPDATE</command> operations will not
+ be replicated. That's because old row is used and it only contains primary
+ key or columns that are part of the <literal>REPLICA IDENTITY</literal>; the
+ remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ operations any column might be used in the <literal>WHERE</literal> clause.
+ New row is used and it contains all columns. A <literal>NULL</literal> value
+ causes the expression to evaluate to false; avoid using columns without
+ not-null constraints in the <literal>WHERE</literal> clause. The
+ <literal>WHERE</literal> clause does not allow functions or user-defined
+ operators.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -240,6 +260,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -253,6 +278,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..42bf8c2 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -208,6 +208,10 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
that are being subscribed to when the replication starts.
The default is <literal>true</literal>.
</para>
+ <para>
+ Row-filtering may also apply here and will affect what data is
+ copied. Refer to the Notes section below.
+ </para>
</listitem>
</varlistentry>
@@ -319,6 +323,19 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
the parameter <literal>create_slot = false</literal>. This is an
implementation restriction that might be lifted in a future release.
</para>
+
+ <para>
+ If any table in the publication has a <literal>WHERE</literal> clause, rows
+ that do not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be replicated. If the subscription has several publications in
+ which the same table has been published with different filters, those
+ expressions get OR'ed together so that rows satisfying any of the expressions
+ will be replicated. Notice this means if one of the publications has no filter
+ at all then all other filters become redundant. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row filtering
+ is ignored.
+ </para>
+
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 63579b2..3ffec3a 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -257,18 +260,22 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -289,10 +296,30 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a ParseState to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -306,6 +333,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -322,6 +355,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 7d4a0e9..1c792ed 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -529,40 +529,61 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *delrels = NIL;
ListCell *oldlc;
+ /*
+ * Check if the relation is member of the existing schema in the
+ * publication or member of the schema list specified.
+ */
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
-
- /* Calculate which relations to drop. */
+ /*
+ * Remove tables that are not found in the new table list and those
+ * tables which have a qual expression. The qual expression could be
+ * in the old table list or in the new table list.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
+ ListCell *newlc;
+ PublicationRelInfo *oldrel;
bool found = false;
+ HeapTuple rftuple;
+ bool rfisnull = true;
+
+ /* look up the cache for the old relmap */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(oldrelid),
+ ObjectIdGetDatum(pubid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual,
+ &rfisnull);
+ ReleaseSysCache(rftuple);
+ }
foreach(newlc, rels)
{
PublicationRelInfo *newpubrel;
newpubrel = (PublicationRelInfo *) lfirst(newlc);
- if (RelationGetRelid(newpubrel->relation) == oldrelid)
+
+ /*
+ * If the new relation or the old relation has a where clause,
+ * we need to remove it so that it can be added afresh later.
+ */
+ if (RelationGetRelid(newpubrel->relation) == oldrelid &&
+ newpubrel->whereClause == NULL && rfisnull)
{
found = true;
break;
}
}
- /* Not yet in the list, open it and add to the list */
+
if (!found)
{
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrelid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
}
@@ -899,6 +920,7 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ List *relids_with_rf = NIL;
/*
* Open, share-lock, and check all the explicitly-specified relations
@@ -926,15 +948,30 @@ OpenTableList(List *tables)
*/
if (list_member_oid(relids, myrelid))
{
+ char *relname = pstrdup(RelationGetRelationName(rel));
+
table_close(rel, ShareUpdateExclusiveLock);
+
+ /* Disallow duplicate tables if there are any with row-filters. */
+ if (t->whereClause || list_member_oid(relids_with_rf, myrelid))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("conflicting or redundant row-filters for \"%s\"",
+ relname)));
+
+ pfree(relname);
continue;
}
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
+ if (t->whereClause)
+ relids_with_rf = lappend_oid(relids_with_rf, myrelid);
+
/*
* Add children of this rel, if requested, so that they too are added
* to the publication. A partitioned table can't have any inheritance
@@ -967,6 +1004,8 @@ OpenTableList(List *tables)
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -974,6 +1013,7 @@ OpenTableList(List *tables)
}
list_free(relids);
+ list_free(relids_with_rf);
return rels;
}
@@ -993,6 +1033,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -1088,6 +1130,11 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
RelationGetRelationName(rel))));
}
+ if (pubrel->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE")));
+
ObjectAddressSet(obj, PublicationRelRelationId, prid);
performDeletion(&obj, DROP_CASCADE, 0);
}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 297b6ee..be9c1fb 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4832,6 +4832,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index f537d3e..5776447 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2312,6 +2312,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
old mode 100644
new mode 100755
index 86ce33b..8e96d54
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9654,12 +9654,13 @@ CreatePublicationStmt:
* relation_expr here.
*/
PublicationObjSpec:
- TABLE relation_expr
+ TABLE relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_TABLE;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $2;
+ $$->pubtable->whereClause = $3;
}
| ALL TABLES IN_P SCHEMA ColId
{
@@ -9674,28 +9675,45 @@ PublicationObjSpec:
$$->pubobjtype = PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA;
$$->location = @5;
}
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ /*
+ * The OptWhereClause (row-filter) must be stored here
+ * but it is valid only for tables. If the ColId was
+ * mistakenly not a table this will be detected later
+ * in preprocess_pubobj_list() and an error is thrown.
+ */
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
$$->location = @1;
}
- | ColId indirection
+ | ColId indirection OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
+ $$->pubtable->whereClause = $3;
$$->location = @1;
}
/* grammar like tablename * , ONLY tablename, ONLY ( tablename ) */
- | extended_relation_expr
+ | extended_relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $1;
+ $$->pubtable->whereClause = $2;
}
| CURRENT_SCHEMA
{
@@ -17343,7 +17361,8 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ if (pubobj->name)
{
/* convert it to PublicationTable */
PublicationTable *pubtable = makeNode(PublicationTable);
@@ -17356,6 +17375,13 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
else if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_SCHEMA ||
pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA)
{
+ /* Row filters are not allowed on schema objects. */
+ if (pubobj->pubtable && pubobj->pubtable->whereClause)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid to use WHERE (row-filter) for a schema"),
+ parser_errposition(pubobj->location));
+
/*
* We can distinguish between the different type of schema
* objects based on whether name and pubtable is set.
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 542f916..29bebb7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..af73b14 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the
+ * relation qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,80 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get relation qual. DISTINCT prevents the same expression, defined for a
+ * table in multiple publications, from being included more than once in
+ * the final expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ /*
+ * Multiple row-filter expressions for the same table will later be
+ * combined in the COPY command using OR, but this means that if any of
+ * the filters is null, none of the other filters is meaningful. So this
+ * loop also checks for null filters and exits early if any are
+ * encountered.
+ */
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+
+ if (isnull)
+ {
+ /*
+ * A single null filter nullifies the effect of any other filter for this
+ * table.
+ */
+ if (*qual)
+ {
+ list_free(*qual);
+ *qual = NIL;
+ }
+ break;
+ }
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +887,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +896,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +907,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with OR'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +927,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of OR'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
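To make the query-building logic in copy_table() above easier to follow, here is a minimal Python sketch (not part of the patch; identifier quoting is simplified compared to quote_qualified_identifier(), and the helper name is invented for illustration). A plain table with no row filter uses the fast `COPY table TO STDOUT` form; otherwise columns are listed explicitly (so generated columns are skipped) and any row filters are OR'ed into a WHERE clause:

```python
def build_copy_query(nspname, relname, attnames, quals, is_plain_table=True):
    """Sketch of how copy_table() builds the COPY command on the publisher.

    `quals` is the (possibly empty) list of row-filter expressions fetched
    by fetch_remote_table_info(); they are combined with OR.
    """
    target = f'"{nspname}"."{relname}"'
    # Regular table with no row filter: use the plain COPY form.
    if is_plain_table and not quals:
        return f"COPY {target} TO STDOUT"
    # Otherwise build COPY (SELECT ...), listing columns explicitly so
    # generated columns are not copied, and OR-ing any row filters.
    cols = ", ".join(f'"{a}"' for a in attnames)
    query = f"COPY (SELECT {cols} FROM {target}"
    if quals:
        query += " WHERE " + " OR ".join(quals)
    return query + ") TO STDOUT"
```

For example, two publications filtering the same table produce a single query whose WHERE clause is the disjunction of both filters, matching the "OR'ed row filters" comment in the patch.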
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6f6a203..3b85915 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -15,16 +15,24 @@
#include "access/tupconvert.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/makefuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -116,6 +124,17 @@ typedef struct RelationSyncEntry
PublicationActions pubactions;
/*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' indicates whether the exprstate has been
+ * assigned yet. We cannot just test the exprstate value itself because
+ * there might be no filter at all for the current relid (in which case
+ * exprstate is NULL).
+ */
+ bool rowfilter_valid;
+ ExprState *exprstate; /* ExprState for row filter(s) */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+
+ /*
* OID of the relation to publish changes as. For a partition, this may
* be set to one of its ancestors whose schema will be used when
* replicating changes, if publish_via_partition_root is set for the
@@ -137,7 +156,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +165,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +647,265 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+
+ /* Cache ExprState using CacheMemoryContext. */
+ Assert(CurrentMemoryContext == CacheMemoryContext);
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be cast to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * This is the same code as ExecPrepareExpr() but that is not used because
+ * we have no EState to pass it. There should probably be another function
+ * in the executor to handle the execution outside a normal Plan tree
+ * context.
+ */
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false, i.e. the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change should be replicated, otherwise false.
+ */
+static bool
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ List *rfnodes = NIL;
+ int n_filters;
+
+ /*
+ * If the row filter cache is currently flagged "invalid" then we don't
+ * yet know whether there are any row filters for this relation.
+ *
+ * This code is usually one-time execution.
+ *
+ * NOTE: The ExprState cache could have been created up-front in
+ * get_rel_sync_entry() instead of being assigned on-the-fly here. We
+ * defer it because there are scenarios where get_rel_sync_entry() is
+ * called but no row is published (e.g. truncate needs no row
+ * evaluation), because an error might occur before the filter is ever
+ * evaluated, and because future operations using get_rel_sync_entry()
+ * may not need an ExprState either. Also, since the decision to publish
+ * is made AFTER the call to get_rel_sync_entry(), the filter evaluation
+ * may turn out to be unnecessary. So we defer this work to the last
+ * moment, when we know it is needed.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ MemoryContext oldctx;
+
+ /* Release the tuple table slot if it already exists. */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState and cache it in entry->exprstate.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ *
+ * NOTE: Multiple row-filters for the same table are combined by OR-ing
+ * them together, but this means that if (in any of the publications)
+ * there is *no* filter then effectively none of the other filters have
+ * any meaning either.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+
+ /*
+ * Look up whether there is a row filter and, if so, remember it in a
+ * list. The code following this 'publications' loop combines all the
+ * filters.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(entry->publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes = lappend(rfnodes, rfnode);
+ MemoryContextSwitchTo(oldctx);
+
+ ReleaseSysCache(rftuple);
+ }
+ else
+ {
+ /*
+ * If there is no row-filter, then any other row-filters for this table
+ * also have no effect (because filters get OR-ed together) so we can
+ * just discard anything found so far and exit early from the publications
+ * loop.
+ */
+ if (rfnodes)
+ {
+ list_free_deep(rfnodes);
+ rfnodes = NIL;
+ }
+ ReleaseSysCache(rftuple);
+ break;
+ }
+
+ }
+
+ } /* loop all subscribed publications */
+
+ /*
+ * Combine all the row filters (if any) into a single filter, and then
+ * build the ExprState for it.
+ */
+ n_filters = list_length(rfnodes);
+ if (n_filters > 0)
+ {
+ Node *rfnode;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes, -1) : linitial(rfnodes);
+ entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains.
+ */
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ entry->rowfilter_valid = true;
+ }
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ if (message_level_is_interesting(DEBUG3))
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * NOTE: Multiple publication row-filters have already been combined into
+ * a single exprstate.
+ */
+ if (entry->exprstate)
+ {
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(entry->exprstate, ecxt);
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +932,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +956,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +963,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * Schema should be sent before the logic that replaces the
+ * relation because it also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +996,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +1030,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +1099,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1116,9 +1421,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1139,8 +1445,11 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1245,9 +1554,6 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
}
list_free(pubids);
@@ -1354,6 +1660,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanup. (It will be rebuilt later if needed.)
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstate != NULL)
+ {
+ pfree(entry->exprstate);
+ entry->exprstate = NULL;
+ }
}
}
@@ -1365,6 +1686,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1696,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1392,6 +1716,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
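The null-filter rule that pgoutput_row_filter() applies when scanning the subscribed publications (and that tablesync.c applies symmetrically on the COPY side) can be summarized in a small Python sketch. This is illustrative only, not patch code; the function name is invented. One publication without a filter means every row qualifies, so the combined filter collapses to "no filter":

```python
def combine_row_filters(filters):
    """Sketch of the filter-combination rule used by pgoutput_row_filter.

    `filters` holds one entry per subscribed publication that includes the
    table: an expression string, or None when that publication has no row
    filter. Because filters are OR-ed together, a single missing (None)
    filter means every row qualifies, so no combined filter is built.
    """
    exprs = []
    for f in filters:
        if f is None:
            # One unfiltered publication => publish all rows.
            return None
        exprs.append(f)
    if not exprs:
        return None  # table is not filtered anywhere
    if len(exprs) == 1:
        return exprs[0]
    # Multiple filters: OR them into one expression.
    return "(" + ") OR (".join(exprs) + ")"
```

A separate rule, implemented in pgoutput_row_filter_exec_expr(), is that a filter evaluating to NULL for a given row is treated as false, i.e. the row is not replicated.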
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index ea721d9..8be5643 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3150,17 +3150,22 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
" JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
"WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
"UNION\n"
"SELECT pubname\n"
+ " , pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+ " , pg_catalog.pg_class c\n"
"WHERE pr.prrelid = '%s'\n"
+ " AND c.oid = pr.prrelid\n"
"UNION\n"
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
"WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
"ORDER BY 1;",
@@ -3196,6 +3201,13 @@ describeOneTableDetails(const char *schemaname,
printfPQExpBuffer(&buf, " \"%s\"",
PQgetvalue(result, i, 0));
+ if (pset.sversion >= 150000)
+ {
+ /* Also display the publication row-filter (if any) for this table */
+ if (!PQgetisnull(result, i, 1))
+ appendPQExpBuffer(&buf, " WHERE (%s)", PQgetvalue(result, i, 1));
+ }
+
printTableAddFooter(&cont, buf.data);
}
PQclear(result);
@@ -6320,8 +6332,12 @@ addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
for (i = 0; i < count; i++)
{
if (!singlecol)
+ {
printfPQExpBuffer(buf, " \"%s.%s\"", PQgetvalue(res, i, 0),
PQgetvalue(res, i, 1));
+ if (!PQgetisnull(res, i, 2))
+ appendPQExpBuffer(buf, " WHERE (%s)", PQgetvalue(res, i, 2));
+ }
else
printfPQExpBuffer(buf, " \"%s\"", PQgetvalue(res, i, 0));
@@ -6450,8 +6466,15 @@ describePublications(const char *pattern)
{
/* Get the tables for the specified publication */
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 1ae439e..bd0d4ce 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -86,6 +86,7 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -122,7 +123,7 @@ extern List *GetPubPartitionOptionRelations(List *result,
Oid relid);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 067138e..5d58a9f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3641,6 +3641,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 1feb558..6959675 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -239,6 +239,154 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+ Table "public.testpub_rf_tbl1"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+----------+--------------+-------------
+ a | integer | | | | plain | |
+ b | text | | | | extended | |
+Publications:
+ "testpub5a" WHERE ((a > 1))
+ "testpub5b"
+ "testpub5c" WHERE ((a > 3))
+
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+ Publication testpub_syntax1
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE ((e < 999))
+
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+ Publication testpub_syntax2
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
+
+DROP PUBLICATION testpub_syntax2;
+-- fail - WHERE row-filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a =...
+ ^
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+ERROR: invalid to use WHERE (row-filter) for a schema
+LINE 1: ...ax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf...
+ ^
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+ERROR: cannot add relation "testpub_rf_myschema1.testpub_rf_tb16" to publication
+DETAIL: Table's schema "testpub_rf_myschema1" is already part of the publication or part of the specified schema list.
+RESET client_min_messages;
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 8fa0435..40198fc 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -134,6 +134,81 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;
+-- fail - a WHERE row-filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+RESET client_min_messages;
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/026_row_filter.pl b/src/test/subscription/t/026_row_filter.pl
new file mode 100644
index 0000000..64e71d0
--- /dev/null
+++ b/src/test/subscription/t/026_row_filter.pl
@@ -0,0 +1,357 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 10;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4a FOR TABLE tab_rowfilter_4 WHERE (c % 2 = 0)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4b FOR TABLE tab_rowfilter_4"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) SELECT generate_series(1, 10)");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4a, tap_pub_4b"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if
+# c is divisible by 2 OR 3 (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(13|2|20), 'check initial data copy from table tab_rowfilter_2');
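The OR'ed-filter arithmetic above can be double-checked with a quick sketch. This is plain Python standing in for the publisher's filter evaluation, purely illustrative; the function name is made up and nothing here comes from the patch itself:

```python
# Simulate combining the two row filters for tab_rowfilter_2:
# tap_pub_1 has (c % 2 = 0), tap_pub_2 has (c % 3 = 0); filters from
# multiple publications of the same table are OR'ed together.
def replicated(c):
    return c % 2 == 0 or c % 3 == 0

rows = [c for c in range(1, 21) if replicated(c)]
print(len(rows), min(rows), max(rows))  # -> 13 2 20, matching 13|2|20 above
```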
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but an entry with no filter matches
+# everything, i.e. the combined result is the same as having no filter at all.
+# Expect all rows: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(10|1|10), 'check initial data copy from table tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false, so the row filter
+# of each partition is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) VALUES (21), (22), (23), (24), (25)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) VALUES (0), (11), (12)");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if
+# c is divisible by 2 OR 3.
+#
+# Expect original rows (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+# Plus (21, 22, 24)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(16|2|24), 'check replicated rows to tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but an entry with no filter matches
+# everything, i.e. the combined result is the same as having no filter at all.
+# Expect all rows from initial copy: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+# And also (0, 11, 12)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter is not true: NULL <> 'filtered' evaluates to NULL (treated as false)
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, the row filter references column b, which is not part of
+# the PK or REPLICA IDENTITY, so the old tuple has b = NULL and the row
+# filter does not evaluate to true
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
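The NULL cases above follow SQL's three-valued logic: `NULL <> 'filtered'` yields NULL, and a row filter that does not evaluate to true is treated as not matching. A minimal Python sketch of that behavior for the filter `(a > 1000 AND b <> 'filtered')`; the function is illustrative and not part of the patch:

```python
def row_filter(a, b):
    """Approximate (a > 1000 AND b <> 'filtered') under SQL semantics:
    comparing against NULL yields unknown, which counts as no match."""
    if b is None:
        return False  # b <> 'filtered' is unknown, so the AND is not true
    return a > 1000 and b != 'filtered'

print(row_filter(1600, None))                 # -> False, UPDATE is filtered out
print(row_filter(1601, 'test 1601 updated'))  # -> True, UPDATE is replicated
```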
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true, so the row filter
+# of the root partitioned table is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After the TRUNCATE and REFRESH PUBLICATION, the initial data copy applies
+# the root partitioned table's row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
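The publish_via_partition_root scenario above can be replayed as a small simulation: apply the root filter `(a < 5000)` to the initial copy and then to each DML change. This is an illustrative sketch only; it models the subscriber's table as a dict keyed by the primary key and uses no PostgreSQL code:

```python
def matches(a):
    return a < 5000  # root partitioned table's row filter

table = {}  # subscriber-side copy after TRUNCATE, keyed by primary key a

# Initial data copy re-runs the filter over the rows already on the publisher.
for a, b in [(1, 100), (7000, 101), (15000, 102), (5500, 300),
             (2, 200), (6005, 201), (16000, 103)]:
    if matches(a):
        table[a] = b

# Subsequent DML is filtered the same way.
for a, b in [(4000, 400), (4001, 401), (4002, 402),
             (4500, 450), (5600, 123), (14000, 1950)]:
    if matches(a):
        table[a] = b
if matches(4001):
    table[4001] = 30   # UPDATE ... SET b = 30 WHERE a = 4001
if matches(4002):
    del table[4002]    # DELETE ... WHERE a = 4002

print(sorted(table.items()))
# -> [(1, 100), (2, 200), (4000, 400), (4001, 30), (4500, 450)]
```

The final state matches the five rows the test expects in tab_rowfilter_partitioned.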
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
v43-0003-Support-updates-based-on-old-and-new-tuple-in-ro.patch (application/octet-stream) Download
From 43ed2ccc28a3522c91fc5c8fef34347e2d718d1b Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Tue, 30 Nov 2021 00:48:48 -0500
Subject: [PATCH v43 3/5] Support updates based on old and new tuple in row
filters.
When applying a row filter to updates, check both the old_tuple and the
new_tuple to decide how the update needs to be transformed.
UPDATE
old-row (match) new-row (no match) -> DELETE
old-row (no match) new row (match) -> INSERT
old-row (match) new row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Deformed tuples are also cached in slots to avoid deforming the same
tuple multiple times.
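The transformation table above boils down to a small decision function. The sketch below is illustrative Python (the patch implements this in C, in pgoutput's update check); the function name is made up:

```python
def transform_update(old_matches, new_matches):
    """Map the row-filter results for an UPDATE's old and new tuples to
    the action actually published; None means the change is dropped."""
    if old_matches and not new_matches:
        return 'DELETE'   # row leaves the filtered set
    if not old_matches and new_matches:
        return 'INSERT'   # row enters the filtered set
    if old_matches and new_matches:
        return 'UPDATE'   # row stays in the filtered set
    return None           # neither tuple matches: drop the change

print(transform_update(True, False))   # -> DELETE
```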
Author: Ajin Cherian
---
src/backend/replication/logical/proto.c | 35 +++--
src/backend/replication/pgoutput/pgoutput.c | 194 +++++++++++++++++++++++++---
src/include/replication/logicalproto.h | 7 +-
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/026_row_filter.pl | 4 +-
src/tools/pgindent/typedefs.list | 1 +
6 files changed, 209 insertions(+), 38 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..6b55a94 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -31,8 +32,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
- HeapTuple tuple, bool binary);
-
+ HeapTuple tuple, TupleTableSlot *slot,
+ bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -410,7 +411,7 @@ logicalrep_write_insert(StringInfo out, TransactionId xid, Relation rel,
pq_sendint32(out, RelationGetRelid(rel));
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, NULL, binary);
}
/*
@@ -442,7 +443,8 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
*/
void
logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
- HeapTuple oldtuple, HeapTuple newtuple, bool binary)
+ HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary)
{
pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
@@ -463,11 +465,11 @@ logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
pq_sendbyte(out, 'O'); /* old tuple follows */
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, oldslot, binary);
}
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, newslot, binary);
}
/*
@@ -536,7 +538,7 @@ logicalrep_write_delete(StringInfo out, TransactionId xid, Relation rel,
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, NULL, binary);
}
/*
@@ -749,11 +751,12 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
-logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binary)
+logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, TupleTableSlot *slot,
+bool binary)
{
TupleDesc desc;
- Datum values[MaxTupleAttributeNumber];
- bool isnull[MaxTupleAttributeNumber];
+ Datum *values;
+ bool *isnull;
int i;
uint16 nliveatts = 0;
@@ -771,7 +774,17 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binar
enlargeStringInfo(out, tuple->t_len +
nliveatts * (1 + 4));
- heap_deform_tuple(tuple, desc, values, isnull);
+ if (slot == NULL || TTS_EMPTY(slot))
+ {
+ values = (Datum *) palloc(desc->natts * sizeof(Datum));
+ isnull = (bool *) palloc(desc->natts * sizeof(bool));
+ heap_deform_tuple(tuple, desc, values, isnull);
+ }
+ else
+ {
+ values = slot->tts_values;
+ isnull = slot->tts_isnull;
+ }
/* Write the values */
for (i = 0; i < desc->natts; i++)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 3b85915..0ccffa7 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,6 +13,7 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
#include "catalog/pg_publication_rel.h"
@@ -25,6 +26,7 @@
#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
#include "utils/builtins.h"
@@ -132,7 +134,10 @@ typedef struct RelationSyncEntry
*/
bool rowfilter_valid;
ExprState *exprstate; /* ExprState for row filter(s) */
- TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
+ TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
+ TupleTableSlot *tmp_new_tuple; /* slot for temporary new tuple used for expression evaluation */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -167,10 +172,15 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
- HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, TupleTableSlot *slot,
+ RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry,
+ ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -734,18 +744,112 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * An update is checked against the row filter, if any.
+ * Based on the old_tuple and new_tuple, the update may be transformed
+ * into an insert or a delete. The resulting action is returned in the
+ * action parameter; if no transformation is needed, it remains an update.
+ * old-row (no match) new-row (no match) -> (drop change)
+ * old-row (no match) new row (match) -> INSERT
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (match) new row (match) -> UPDATE
+ * Returns true if the change is to be replicated, else false.
*/
static bool
-pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
+{
+ TupleDesc desc = RelationGetDescr(relation);
+ int i;
+ bool old_matched, new_matched;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ /* update requires a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /*
+ * If no old_tuple, then none of the replica identity columns changed
+ * and this would reduce to a simple update.
+ */
+ if (!oldtuple)
+ {
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ return pgoutput_row_filter(relation, NULL, newtuple, NULL, entry);
+ }
+
+ old_slot = entry->old_tuple;
+ new_slot = entry->new_tuple;
+ tmp_new_slot = entry->tmp_new_tuple;
+ ExecClearTuple(old_slot);
+ ExecClearTuple(new_slot);
+ ExecClearTuple(tmp_new_slot);
+
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, new_slot);
+
+ /*
+ * For updates, both the newtuple and oldtuple need to be checked
+ * against the row filter. The newtuple might not have all the
+ * replica identity columns, in which case the missing values need
+ * to be copied over from the oldtuple.
+ */
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ /* if the column in the new_tuple is null, nothing to do */
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ /*
+ * Unchanged toasted replica identity columns are only
+ * detoasted in the old tuple; copy them over to the newtuple.
+ */
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ Assert(!old_slot->tts_isnull[i] &&
+ !VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]));
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ }
+
+ }
+
+ old_matched = pgoutput_row_filter(relation, NULL, NULL, old_slot, entry);
+ new_matched = pgoutput_row_filter(relation, NULL, NULL, tmp_new_slot, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (!old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && old_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter the first time it is needed.
+ */
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
{
- EState *estate;
- ExprContext *ecxt;
ListCell *lc;
- bool result = true;
- Oid relid = RelationGetRelid(relation);
List *rfnodes = NIL;
int n_filters;
@@ -857,16 +961,34 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
/*
- * Create a tuple table slot for row filter. TupleDesc must live as
- * long as the cache remains.
+ * Create tuple table slots for row filter. Create a copy of the
+ * TupleDesc as it needs to live as long as the cache remains.
*/
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
MemoryContextSwitchTo(oldctx);
}
entry->rowfilter_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * Returns true if the change should be replicated, otherwise false.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *slot,
+RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
/* Bail out if there is no row filter */
if (!entry->exprstate)
@@ -885,7 +1007,12 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
ecxt = GetPerTupleExprContext(estate);
ecxt->ecxt_scantuple = entry->scantuple;
- ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+ if (newtuple || oldtuple)
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+ else
+ {
+ ecxt->ecxt_scantuple = slot;
+ }
/*
* NOTE: Multiple publication row-filters have already been combined to a
@@ -898,7 +1025,6 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
}
/* Cleanup allocated resources */
- ResetExprContext(ecxt);
FreeExecutorState(estate);
PopActiveSnapshot();
@@ -956,6 +1082,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -964,7 +1093,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(relation, NULL, tuple, NULL, relentry))
break;
/*
@@ -995,9 +1124,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
- /* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter_update_check(relation, oldtuple, newtuple, relentry,
+ &modified_action))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1020,8 +1150,27 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ logicalrep_write_update(ctx->out, xid, relation,
+ oldtuple, newtuple, relentry->old_tuple,
+ relentry->new_tuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1031,7 +1180,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1449,6 +1598,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
+ entry->new_tuple = NULL;
+ entry->old_tuple = NULL;
+ entry->tmp_new_tuple = NULL;
entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..427c40a 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -211,7 +212,11 @@ extern void logicalrep_write_insert(StringInfo out, TransactionId xid,
extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup);
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
- HeapTuple newtuple, bool binary);
+ HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/026_row_filter.pl b/src/test/subscription/t/026_row_filter.pl
index de6b73d..a2f25f6 100644
--- a/src/test/subscription/t/026_row_filter.pl
+++ b/src/test/subscription/t/026_row_filter.pl
@@ -277,7 +277,8 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+# but this row is removed again: the UPDATE below fails the filter and is replicated as a DELETE.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -289,7 +290,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 575969c..e8dc5ad 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2198,6 +2198,7 @@ ReorderBufferApplyChangeCB
ReorderBufferApplyTruncateCB
ReorderBufferBeginCB
ReorderBufferChange
+ReorderBufferChangeType
ReorderBufferCommitCB
ReorderBufferCommitPreparedCB
ReorderBufferDiskChange
--
1.8.3.1
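As an aside for reviewers, the action transformation that pgoutput_row_filter_update_check performs in the 0001 patch above can be summarized by a small decision table: an UPDATE is republished as INSERT, DELETE, or UPDATE (or skipped entirely) depending on which of the old and new row versions match the filter. The following is a minimal sketch in Python, not PostgreSQL code; the function name and return values are illustrative only.

```python
def transform_update(old_matches: bool, new_matches: bool):
    """Return the action to replicate for an UPDATE, or None to skip it.

    Mirrors the old_matched/new_matched checks in
    pgoutput_row_filter_update_check: the subscriber must see the row
    appear, disappear, or change according to the filtered view.
    """
    if not old_matches and not new_matches:
        return None       # neither version passes the filter: skip the change
    if not old_matches and new_matches:
        return "INSERT"   # row enters the filtered set
    if old_matches and not new_matches:
        return "DELETE"   # row leaves the filtered set
    return "UPDATE"       # row stays within the filtered set
```

This is why the 026_row_filter.pl test above no longer expects row 1600 on the subscriber: its UPDATE set b to NULL, so the new tuple failed the filter and the change was sent as a DELETE.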
v43-0002-PS-Row-filter-validation-walker.patchapplication/octet-stream; name=v43-0002-PS-Row-filter-validation-walker.patchDownload
From 2ad3049f95c7034a4d7a1517891c87a41ff7584d Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Tue, 30 Nov 2021 00:42:09 -0500
Subject: [PATCH v43 2/5] PS - Row filter validation walker
This patch implements a parse-tree "walker" to validate a row-filter expression.
REPLICA IDENTITY validation
---------------------------
For publish mode "delete" and "update" it validates that any columns referenced
in the filter expression must be part of REPLICA IDENTITY or Primary Key.
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
Expression Node-kind validation
-------------------------------
Only simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no user-defined functions.
- no system functions (unless they are IMMUTABLE). See design decision at [1].
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr, NullIfExpr
[1] https://www.postgresql.org/message-id/CAA4eK1%2BXoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ%40mail.gmail.com
This patch also disables (#if 0) all other row-filter errors which were thrown for
EXPR_KIND_PUBLICATION_WHERE in the 0001 patch.
---
src/backend/catalog/pg_publication.c | 179 +++++++++++++++++++++++++++++-
src/backend/parser/parse_agg.c | 5 +-
src/backend/parser/parse_expr.c | 6 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 2 +
src/test/regress/expected/publication.out | 134 +++++++++++++++++++---
src/test/regress/sql/publication.sql | 98 +++++++++++++++-
src/test/subscription/t/026_row_filter.pl | 7 +-
src/tools/pgindent/typedefs.list | 1 +
9 files changed, 404 insertions(+), 31 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 3ffec3a..4dc1f8a 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,9 +33,11 @@
#include "catalog/pg_publication_namespace.h"
#include "catalog/pg_publication_rel.h"
#include "catalog/pg_type.h"
+#include "catalog/pg_proc.h"
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parse_clause.h"
#include "parser/parse_collate.h"
#include "parser/parse_relation.h"
@@ -219,10 +221,178 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
+/* For rowfilter_walker. */
+typedef struct {
+ Relation rel;
+ bool check_replident; /* check if Var is bms_replident member? */
+ Bitmapset *bms_replident;
+} rf_context;
+
+/*
+ * The row filter walker checks that the row filter expression is legal.
+ *
+ * Rules: Node-type validation
+ * ---------------------------
+ * Allow only simple or compound expressions such as:
+ * - "(Var Op Const)" or
+ * - "(Var Op Var)" or
+ * - "(Var Op Const) Bool (Var Op Const)"
+ * - etc
+ * (where Var is a column of the table this filter belongs to)
+ *
+ * Specifically,
+ * - User-defined operators are not allowed.
+ * - User-defined functions are not allowed.
+ * - System functions that are not IMMUTABLE are not allowed.
+ * - NULLIF is allowed.
+ *
+ * Rules: Replica Identity validation
+ * -----------------------------------
+ * If the flag context.check_replident is true then validate that every variable
+ * referenced by the filter expression is a valid member of the allowed set of
+ * replica identity columns (context.bms_replident).
+ */
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+ bool too_complex = false;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ /* Optionally, do replica identity validation of the referenced column. */
+ if (context->check_replident)
+ {
+ Oid relid = RelationGetRelid(context->rel);
+ Var *var = (Var *) node;
+ AttrNumber attnum = var->varattno;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, context->bms_replident))
+ {
+ const char *colname = get_attname(relid, attnum, false);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ RelationGetRelationName(context->rel)),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+ }
+ }
+ else if (IsA(node, Const) || IsA(node, BoolExpr) || IsA(node, NullIfExpr))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ Oid funcid = ((FuncExpr *)node)->funcid;
+ char *funcname = get_func_name(funcid);
+
+ /*
+ * User-defined functions are not allowed.
+ * System-functions that are not IMMUTABLE are not allowed.
+ */
+ if (funcid >= FirstNormalObjectId)
+ {
+ forbidden = psprintf("user-defined functions are not allowed: %s",
+ funcname);
+ }
+ else
+ {
+ if (func_volatile(funcid) != PROVOLATILE_IMMUTABLE)
+ forbidden = psprintf("system functions that are not IMMUTABLE are not allowed: %s",
+ funcname);
+ }
+ }
+ else
+ {
+ elog(DEBUG1, "the row filter contained something unexpected: %s", nodeToString(node));
+ too_complex = true;
+ }
+
+ if (too_complex)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(context->rel)),
+ errhint("only simple expressions using columns, constants and immutable system functions are allowed")
+ ));
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(context->rel)),
+ errdetail("%s", forbidden)
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
+ * Check if the row-filter is valid according to the following rules:
+ *
+ * 1. Only certain simple node types are permitted in the expression. See
+ * function rowfilter_walker for details.
+ *
+ * 2. If the publish operation contains "delete" or "update" then only columns
+ * that are allowed by the REPLICA IDENTITY rules are permitted to be used in
+ * the row-filter WHERE clause.
*/
+static void
+rowfilter_expr_checker(Publication *pub, Relation rel, Node *rfnode)
+{
+ rf_context context = {0};
+
+ context.rel = rel;
+
+ /*
+ * For "delete" or "update", check that filter cols are also valid replica
+ * identity cols.
+ */
+ if (pub->pubactions.pubdelete || pub->pubactions.pubupdate)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ context.check_replident = true;
+
+ /*
+ * Find which columns are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means the primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+ }
+ }
+
+ /*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if anything not permitted or unexpected is encountered.
+ */
+ rowfilter_walker(rfnode, &context);
+
+ bms_free(context.bms_replident);
+}
+
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -315,10 +485,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
whereclause = transformWhereClause(pstate,
copyObject(pri->whereClause),
EXPR_KIND_PUBLICATION_WHERE,
- "PUBLICATION");
+ "PUBLICATION WHERE");
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, targetrel, whereclause);
}
/* Form a tuple. */
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..5e0c391 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,12 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
if (isAgg)
err = _("aggregate functions are not allowed in publication WHERE expressions");
else
err = _("grouping operations are not allowed in publication WHERE expressions");
-
+#endif
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +952,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("window functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..3519e62 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -201,6 +201,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
case T_FuncCall:
{
+#if 0
/*
* Forbid functions in publication WHERE condition
*/
@@ -209,6 +210,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("functions are not allowed in publication WHERE expressions"),
parser_errposition(pstate, exprLocation(expr))));
+#endif
result = transformFuncCall(pstate, (FuncCall *) expr);
break;
@@ -1777,7 +1779,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
err = _("cannot use subquery in publication WHERE expression");
+#endif
break;
/*
@@ -3100,7 +3104,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 29bebb7..212f473 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,10 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
+ pstate->p_hasTargetSRFs = true;
+#if 0
err = _("set-returning functions are not allowed in publication WHERE expressions");
+#endif
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..b3588df 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,12 +718,14 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+#if 0
/* Check it's not a custom operator for publication WHERE expressions */
if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("user-defined operators are not allowed in publication WHERE expressions"),
parser_errposition(pstate, location)));
+#endif
/* Do typecasting and build the expression tree */
if (ltree == NULL)
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 6959675..fdf7659 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -248,13 +248,15 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -264,7 +266,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -275,7 +277,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -286,7 +288,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
@@ -310,26 +312,26 @@ Publications:
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax1
Publication testpub_syntax1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE ((e < 999))
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax2
Publication testpub_syntax2
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
@@ -353,19 +355,31 @@ ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: user-defined functions are not allowed: testpub_rf_func99
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: system functions that are not IMMUTABLE are not allowed: random
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+HINT: only simple expressions using columns, constants and immutable system functions are allowed
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
@@ -387,6 +401,92 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 40198fc..c7160bd 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -143,7 +143,9 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -163,12 +165,12 @@ RESET client_min_messages;
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax1
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax2
DROP PUBLICATION testpub_syntax2;
@@ -182,13 +184,23 @@ SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
SET client_min_messages = 'ERROR';
@@ -208,6 +220,82 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
+
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/026_row_filter.pl b/src/test/subscription/t/026_row_filter.pl
index 64e71d0..de6b73d 100644
--- a/src/test/subscription/t/026_row_filter.pl
+++ b/src/test/subscription/t/026_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -280,9 +282,7 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -291,7 +291,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index f41ef0d..575969c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3501,6 +3501,7 @@ replace_rte_variables_context
ret_type
rewind_source
rewrite_event
+rf_context
rijndael_ctx
rm_detail_t
role_auth_extra
--
1.8.3.1
v43-0005-cache-the-result-of-row-filter-column-validation.patch (application/octet-stream)
From 48925da3a58e773d3d7a769f41a0a0c6d5388fae Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Tue, 30 Nov 2021 00:56:15 -0500
Subject: [PATCH v43 5/5] cache the result of row filter column validation.
For publish modes "delete" and "update", validate that any columns referenced
in the filter expression are part of the REPLICA IDENTITY or primary key.
Move the row filter column validation to CheckCmdReplicaIdentity, so that
the validation is executed only when an actual UPDATE or DELETE is executed on
the published relation. This is consistent with the existing replica identity
check.
Cache the results of the row filter column validation in the relcache to
reduce the cost of the validation. This is safe because every operation that
changes the row filter or replica identity invalidates the relcache.
Temporarily retain the function GetRelationPublicationActions because it is a
public function.
---
src/backend/catalog/pg_publication.c | 131 ++++++++---------------------
src/backend/executor/execReplication.c | 29 ++++++-
src/backend/utils/cache/relcache.c | 134 ++++++++++++++++++++++++------
src/include/catalog/pg_publication.h | 13 +++
src/include/utils/rel.h | 3 +-
src/include/utils/relcache.h | 1 +
src/test/regress/expected/publication.out | 72 +++++++++-------
src/test/regress/sql/publication.sql | 39 ++++++---
8 files changed, 254 insertions(+), 168 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 4dc1f8a..b80c21e 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -221,12 +221,29 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
-/* For rowfilter_walker. */
-typedef struct {
- Relation rel;
- bool check_replident; /* check if Var is bms_replident member? */
- Bitmapset *bms_replident;
-} rf_context;
+/*
+ * Check if all the columns used in the row-filter WHERE clause are part of
+ * REPLICA IDENTITY
+ */
+bool
+check_rowfilter_replident(Node *node, Bitmapset *bms_replident)
+{
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+ AttrNumber attnum = var->varattno;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber,
+ bms_replident))
+ return true;
+ }
+
+ return expression_tree_walker(node, check_rowfilter_replident,
+ (void *) bms_replident);
+}
/*
* The row filter walker checks that the row filter expression is legal.
@@ -245,15 +262,9 @@ typedef struct {
* - User-defined functions are not allowed.
* - System functions that are not IMMUTABLE are not allowed.
* - NULLIF is allowed.
- *
- * Rules: Replica Identity validation
- * -----------------------------------
- * If the flag context.check_replident is true then validate that every variable
- * referenced by the filter expression is a valid member of the allowed set of
- * replica identity columns (context.bms_replindent)
*/
static bool
-rowfilter_walker(Node *node, rf_context *context)
+rowfilter_walker(Node *node, Relation relation)
{
char *forbidden = NULL;
bool too_complex = false;
@@ -261,29 +272,7 @@ rowfilter_walker(Node *node, rf_context *context)
if (node == NULL)
return false;
- if (IsA(node, Var))
- {
- /* Optionally, do replica identify validation of the referenced column. */
- if (context->check_replident)
- {
- Oid relid = RelationGetRelid(context->rel);
- Var *var = (Var *) node;
- AttrNumber attnum = var->varattno;
-
- if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, context->bms_replident))
- {
- const char *colname = get_attname(relid, attnum, false);
-
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
- errmsg("cannot add relation \"%s\" to publication",
- RelationGetRelationName(context->rel)),
- errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
- colname)));
- }
- }
- }
- else if (IsA(node, Const) || IsA(node, BoolExpr) || IsA(node, NullIfExpr))
+ else if (IsA(node, Var) || IsA(node, Const) || IsA(node, BoolExpr) || IsA(node, NullIfExpr))
{
/* OK */
}
@@ -323,74 +312,18 @@ rowfilter_walker(Node *node, rf_context *context)
if (too_complex)
ereport(ERROR,
(errmsg("invalid publication WHERE expression for relation \"%s\"",
- RelationGetRelationName(context->rel)),
+ RelationGetRelationName(relation)),
errhint("only simple expressions using columns, constants and immutable system functions are allowed")
));
if (forbidden)
ereport(ERROR,
(errmsg("invalid publication WHERE expression for relation \"%s\"",
- RelationGetRelationName(context->rel)),
+ RelationGetRelationName(relation)),
errdetail("%s", forbidden)
));
- return expression_tree_walker(node, rowfilter_walker, (void *)context);
-}
-
-/*
- * Check if the row-filter is valid according to the following rules:
- *
- * 1. Only certain simple node types are permitted in the expression. See
- * function rowfilter_walker for details.
- *
- * 2. If the publish operation contains "delete" or "update" then only columns
- * that are allowed by the REPLICA IDENTITY rules are permitted to be used in
- * the row-filter WHERE clause.
- */
-static void
-rowfilter_expr_checker(Publication *pub, Relation rel, Node *rfnode)
-{
- rf_context context = {0};
-
- context.rel = rel;
-
- /*
- * For "delete" or "update", check that filter cols are also valid replica
- * identity cols.
- */
- if (pub->pubactions.pubdelete || pub->pubactions.pubupdate)
- {
- char replica_identity = rel->rd_rel->relreplident;
-
- if (replica_identity == REPLICA_IDENTITY_FULL)
- {
- /*
- * FULL means all cols are in the REPLICA IDENTITY, so all cols are
- * allowed in the row-filter too.
- */
- }
- else
- {
- context.check_replident = true;
-
- /*
- * Find what are the cols that are part of the REPLICA IDENTITY.
- * Note that REPLICA IDENTIY DEFAULT means primary key or nothing.
- */
- if (replica_identity == REPLICA_IDENTITY_DEFAULT)
- context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
- else
- context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
- }
- }
-
- /*
- * Walk the parse-tree of this publication row filter expression and throw an
- * error if anything not permitted or unexpected is encountered.
- */
- rowfilter_walker(rfnode, &context);
-
- bms_free(context.bms_replident);
+ return expression_tree_walker(node, rowfilter_walker, (void *) relation);
}
List *
@@ -490,8 +423,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
- /* Validate the row-filter. */
- rowfilter_expr_checker(pub, targetrel, whereclause);
+ /*
+ * Walk the parse-tree of this publication row filter expression and
+ * throw an error if anything not permitted or unexpected is
+ * encountered.
+ */
+ rowfilter_walker(whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 574d7d2..c917466 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -567,12 +567,34 @@ ExecSimpleRelationDelete(ResultRelInfo *resultRelInfo,
void
CheckCmdReplicaIdentity(Relation rel, CmdType cmd)
{
- PublicationActions *pubactions;
+ PublicationInfo *pubinfo;
/* We only need to do checks for UPDATE and DELETE. */
if (cmd != CMD_UPDATE && cmd != CMD_DELETE)
return;
+ pubinfo = RelationGetPublicationInfo(rel);
+
+ /*
+ * If not all columns in the publication row filter are part of the REPLICA
+ * IDENTITY, then it is unsafe to execute UPDATE or DELETE on the relation.
+ */
+ if (!pubinfo->rfcol_valid_for_replid)
+ {
+ if (cmd == CMD_UPDATE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot update table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Not all row filter columns are part of the REPLICA IDENTITY")));
+ else if (cmd == CMD_DELETE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot delete from table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Not all row filter columns are part of the REPLICA IDENTITY")));
+ }
+
/* If relation has replica identity we are always good. */
if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
OidIsValid(RelationGetReplicaIndex(rel)))
@@ -583,14 +605,13 @@ CheckCmdReplicaIdentity(Relation rel, CmdType cmd)
*
* Check if the table publishes UPDATES or DELETES.
*/
- pubactions = GetRelationPublicationActions(rel);
- if (cmd == CMD_UPDATE && pubactions->pubupdate)
+ if (cmd == CMD_UPDATE && pubinfo->pubactions.pubupdate)
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot update table \"%s\" because it does not have a replica identity and publishes updates",
RelationGetRelationName(rel)),
errhint("To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.")));
- else if (cmd == CMD_DELETE && pubactions->pubdelete)
+ else if (cmd == CMD_DELETE && pubinfo->pubactions.pubdelete)
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot delete from table \"%s\" because it does not have a replica identity and publishes deletes",
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 9fa9e67..542da49 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -56,6 +56,7 @@
#include "catalog/pg_opclass.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "catalog/pg_rewrite.h"
#include "catalog/pg_shseclabel.h"
#include "catalog/pg_statistic_ext.h"
@@ -2432,8 +2433,8 @@ RelationDestroyRelation(Relation relation, bool remember_tupdesc)
bms_free(relation->rd_keyattr);
bms_free(relation->rd_pkattr);
bms_free(relation->rd_idattr);
- if (relation->rd_pubactions)
- pfree(relation->rd_pubactions);
+ if (relation->rd_pubinfo)
+ pfree(relation->rd_pubinfo);
if (relation->rd_options)
pfree(relation->rd_options);
if (relation->rd_indextuple)
@@ -5547,22 +5548,45 @@ RelationGetExclusionInfo(Relation indexRelation,
struct PublicationActions *
GetRelationPublicationActions(Relation relation)
{
- List *puboids;
- ListCell *lc;
- MemoryContext oldcxt;
- Oid schemaid;
- PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
+ PublicationInfo *pubinfo;
+ PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
+
+ pubinfo = RelationGetPublicationInfo(relation);
+
+ pubactions = memcpy(pubactions, &pubinfo->pubactions,
+ sizeof(PublicationActions));
+
+ pfree(pubinfo);
+
+ return pubactions;
+}
+
+
+
+/*
+ * Get publication information for the given relation.
+ */
+struct PublicationInfo *
+RelationGetPublicationInfo(Relation relation)
+{
+ List *puboids;
+ ListCell *lc;
+ MemoryContext oldcxt;
+ Oid schemaid;
+ Bitmapset *bms_replident = NULL;
+ PublicationInfo *pubinfo = palloc0(sizeof(PublicationInfo));
+
+ pubinfo->rfcol_valid_for_replid = true;
/*
* If not publishable, it publishes no actions. (pgoutput_change() will
* ignore it.)
*/
if (!is_publishable_relation(relation))
- return pubactions;
+ return pubinfo;
- if (relation->rd_pubactions)
- return memcpy(pubactions, relation->rd_pubactions,
- sizeof(PublicationActions));
+ if (relation->rd_pubinfo)
+ return memcpy(pubinfo, relation->rd_pubinfo, sizeof(PublicationInfo));
/* Fetch the publication membership info. */
puboids = GetRelationPublications(RelationGetRelid(relation));
@@ -5586,12 +5610,25 @@ GetRelationPublicationActions(Relation relation)
GetSchemaPublications(schemaid));
}
}
+
puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
+ /*
+ * Find which columns are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (relation->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT)
+ bms_replident = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else if (relation->rd_rel->relreplident == REPLICA_IDENTITY_INDEX)
+ bms_replident = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_IDENTITY_KEY);
+
foreach(lc, puboids)
{
Oid pubid = lfirst_oid(lc);
HeapTuple tup;
+
Form_pg_publication pubform;
tup = SearchSysCache1(PUBLICATIONOID, ObjectIdGetDatum(pubid));
@@ -5601,35 +5638,80 @@ GetRelationPublicationActions(Relation relation)
pubform = (Form_pg_publication) GETSTRUCT(tup);
- pubactions->pubinsert |= pubform->pubinsert;
- pubactions->pubupdate |= pubform->pubupdate;
- pubactions->pubdelete |= pubform->pubdelete;
- pubactions->pubtruncate |= pubform->pubtruncate;
+ pubinfo->pubactions.pubinsert |= pubform->pubinsert;
+ pubinfo->pubactions.pubupdate |= pubform->pubupdate;
+ pubinfo->pubactions.pubdelete |= pubform->pubdelete;
+ pubinfo->pubactions.pubtruncate |= pubform->pubtruncate;
ReleaseSysCache(tup);
/*
- * If we know everything is replicated, there is no point to check for
- * other publications.
+ * If the publication actions include UPDATE or DELETE, validate that
+ * any columns referenced in the filter expression are part of the
+ * REPLICA IDENTITY.
+ *
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter and we can skip the validation.
+ */
+ if ((pubform->pubupdate || pubform->pubdelete) &&
+ relation->rd_rel->relreplident != REPLICA_IDENTITY_FULL)
+ {
+ HeapTuple rftuple;
+
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP,
+ ObjectIdGetDatum(RelationGetRelid(relation)),
+ ObjectIdGetDatum(pubid));
+
+ if (HeapTupleIsValid(rftuple))
+ {
+ Datum rfdatum;
+ bool rfisnull;
+ Node *rfnode;
+
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
+ Anum_pg_publication_rel_prqual,
+ &rfisnull);
+
+ if (!rfisnull)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ if (check_rowfilter_replident(rfnode, bms_replident))
+ {
+ pubinfo->rfcol_valid_for_replid = false;
+ ReleaseSysCache(rftuple);
+ break;
+ }
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+ }
+
+ /*
+ * If we know everything is replicated and some columns are not part of
+ * the replica identity, there is no point in checking other publications.
*/
- if (pubactions->pubinsert && pubactions->pubupdate &&
- pubactions->pubdelete && pubactions->pubtruncate)
+ if (pubinfo->pubactions.pubinsert && pubinfo->pubactions.pubupdate &&
+ pubinfo->pubactions.pubdelete && pubinfo->pubactions.pubtruncate &&
+ !pubinfo->rfcol_valid_for_replid)
break;
}
- if (relation->rd_pubactions)
+ bms_free(bms_replident);
+
+ if (relation->rd_pubinfo)
{
- pfree(relation->rd_pubactions);
- relation->rd_pubactions = NULL;
+ pfree(relation->rd_pubinfo);
+ relation->rd_pubinfo = NULL;
}
/* Now save copy of the actions in the relcache entry. */
oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
- relation->rd_pubactions = palloc(sizeof(PublicationActions));
- memcpy(relation->rd_pubactions, pubactions, sizeof(PublicationActions));
+ relation->rd_pubinfo = palloc(sizeof(PublicationInfo));
+ memcpy(relation->rd_pubinfo, pubinfo, sizeof(PublicationInfo));
MemoryContextSwitchTo(oldcxt);
- return pubactions;
+ return pubinfo;
}
/*
@@ -6184,7 +6266,7 @@ load_relcache_init_file(bool shared)
rel->rd_keyattr = NULL;
rel->rd_pkattr = NULL;
rel->rd_idattr = NULL;
- rel->rd_pubactions = NULL;
+ rel->rd_pubinfo = NULL;
rel->rd_statvalid = false;
rel->rd_statlist = NIL;
rel->rd_fkeyvalid = false;
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index bd0d4ce..f698049 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -74,6 +74,18 @@ typedef struct PublicationActions
bool pubtruncate;
} PublicationActions;
+typedef struct PublicationInfo
+{
+ PublicationActions pubactions;
+
+ /*
+ * True if the pubactions do not include UPDATE or DELETE, or if
+ * all the columns in the row filter expression are part
+ * of the replica identity.
+ */
+ bool rfcol_valid_for_replid;
+} PublicationInfo;
+
typedef struct Publication
{
Oid oid;
@@ -131,5 +143,6 @@ extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
extern char *get_publication_name(Oid pubid, bool missing_ok);
+extern bool check_rowfilter_replident(Node *node, Bitmapset *bms_replident);
#endif /* PG_PUBLICATION_H */
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b4faa1c..57cebde 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -160,7 +160,8 @@ typedef struct RelationData
Bitmapset *rd_pkattr; /* cols included in primary key */
Bitmapset *rd_idattr; /* included in replica identity index */
- PublicationActions *rd_pubactions; /* publication actions */
+ /* data managed by RelationGetPublicationInfo: */
+ PublicationInfo *rd_pubinfo; /* publication information */
/*
* rd_options is set whenever rd_rel is loaded into the relcache entry.
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index aa060ef..54f9825 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -76,6 +76,7 @@ extern void RelationInitIndexAccessInfo(Relation relation);
/* caller must include pg_publication.h */
struct PublicationActions;
extern struct PublicationActions *GetRelationPublicationActions(Relation relation);
+extern struct PublicationInfo *RelationGetPublicationInfo(Relation relation);
extern void RelationInitTableAccessMethod(Relation relation);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index fdf7659..d9e42e5 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -416,21 +416,27 @@ DROP PUBLICATION testpub6;
-- ok - "b" is a PK col
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
-RESET client_min_messages;
DROP PUBLICATION testpub6;
--- fail - "c" is not part of the PK
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
-DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
--- fail - "d" is not part of the PK
+-- fail - "c" is not part of the PK
+UPDATE rf_tbl_abcd_pk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Not all row filter columns are part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
-DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+UPDATE rf_tbl_abcd_pk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Not all row filter columns are part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
--- fail - "a" is not part of REPLICA IDENTITY
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
-DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "a" is not part of REPLICA IDENTITY
+UPDATE rf_tbl_abcd_nopk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
+DETAIL: Not all row filter columns are part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
-- Case 2. REPLICA IDENTITY FULL
ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
@@ -444,21 +450,29 @@ SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
DROP PUBLICATION testpub6;
RESET client_min_messages;
+SET client_min_messages = 'ERROR';
-- Case 3. REPLICA IDENTITY NOTHING
ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
--- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
-DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
--- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Not all row filter columns are part of the REPLICA IDENTITY
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
-DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
--- fail - "a" is not in REPLICA IDENTITY NOTHING
+ERROR: publication "testpub6" already exists
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Not all row filter columns are part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
-DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_nopk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
+DETAIL: Not all row filter columns are part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
-- Case 4. REPLICA IDENTITY INDEX
ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
@@ -466,21 +480,23 @@ ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
--- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
-DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+update rf_tbl_abcd_pk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Not all row filter columns are part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
-SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
DROP PUBLICATION testpub6;
-RESET client_min_messages;
--- fail - "a" is not in REPLICA IDENTITY INDEX
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
-DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+update rf_tbl_abcd_nopk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
+DETAIL: Not all row filter columns are part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
-- ok - "c" is part of REPLICA IDENTITY INDEX
-SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
DROP PUBLICATION testpub6;
RESET client_min_messages;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index c7160bd..8f6fd65 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -237,15 +237,21 @@ DROP PUBLICATION testpub6;
-- ok - "b" is a PK col
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
-RESET client_min_messages;
DROP PUBLICATION testpub6;
--- fail - "c" is not part of the PK
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
--- fail - "d" is not part of the PK
+-- fail - "c" is not part of the PK
+UPDATE rf_tbl_abcd_pk set a = 1;
+DROP PUBLICATION testpub6;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- fail - "d" is not part of the PK
+UPDATE rf_tbl_abcd_pk set a = 1;
+DROP PUBLICATION testpub6;
-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
--- fail - "a" is not part of REPLICA IDENTITY
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not part of REPLICA IDENTITY
+UPDATE rf_tbl_abcd_nopk set a = 1;
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
-- Case 2. REPLICA IDENTITY FULL
ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
@@ -261,15 +267,22 @@ CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
DROP PUBLICATION testpub6;
RESET client_min_messages;
+SET client_min_messages = 'ERROR';
-- Case 3. REPLICA IDENTITY NOTHING
ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
--- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
--- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk set a = 1;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
--- fail - "a" is not in REPLICA IDENTITY NOTHING
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk set a = 1;
+DROP PUBLICATION testpub6;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_nopk set a = 1;
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
-- Case 4. REPLICA IDENTITY INDEX
ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
@@ -278,17 +291,19 @@ ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
--- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+update rf_tbl_abcd_pk set a = 1;
+DROP PUBLICATION testpub6;
-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
-SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
DROP PUBLICATION testpub6;
-RESET client_min_messages;
--- fail - "a" is not in REPLICA IDENTITY INDEX
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+update rf_tbl_abcd_nopk set a = 1;
+DROP PUBLICATION testpub6;
-- ok - "c" is part of REPLICA IDENTITY INDEX
-SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
DROP PUBLICATION testpub6;
RESET client_min_messages;
--
1.8.3.1
v43-0004-Tab-auto-complete-and-pgdump-support-for-Row-Fil.patch (application/octet-stream)
From 736ee031d1892438ad8da89d7536dbc0b55aa5af Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Tue, 30 Nov 2021 00:50:45 -0500
Subject: [PATCH v43 4/5] Tab auto-complete and pgdump support for Row Filter.
tab-auto-complete
-----------------
e.g.
"CREATE PUBLICATION <name> FOR TABLE <name>" - complete with "WHERE (".
"ALTER PUBLICATION <name> ADD|SET TABLE <name>" - complete with "WHERE (".
Author: Peter Smith
pg_dump
-------
Author: Euler Taveira
---
src/bin/pg_dump/pg_dump.c | 24 ++++++++++++++++++++----
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/tab-complete.c | 13 ++++++++++++-
3 files changed, 33 insertions(+), 5 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 5a2094d..3696ad2 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4264,6 +4264,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4274,9 +4275,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4285,6 +4293,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4325,6 +4334,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationObject(&(pubrinfo[j].dobj), fout);
@@ -4395,8 +4408,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index d1d8608..0842a3c 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -630,6 +630,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 630026d..ad5ea9b 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,14 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny))
+ COMPLETE_WITH("WHERE (");
+ /* "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with table attributes */
+ /* "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2778,10 +2786,13 @@ psql_completion(const char *text, int start, int end)
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
+ /* "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
/*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
--
1.8.3.1
On Tue, Nov 30, 2021 at 11:37 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Nov 30, 2021 at 10:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Nov 29, 2021 at 8:40 PM Euler Taveira <euler@eulerto.com> wrote:
On Mon, Nov 29, 2021, at 7:11 AM, Amit Kapila wrote:
I don't think it is a good idea to combine the row-filter from the
publication that publishes just 'insert' with the row-filter that
publishes 'updates'. We shouldn't apply the 'insert' filter for
'update' and similarly for other publication operations. We can combine the
filters when the published operations are the same. So, this means
that we might need to cache multiple row-filters but I think that is
better than having another restriction that publish operation 'insert'
should also honor RI columns restriction.

That's exactly what I meant to say but apparently I didn't explain it in detail.
If a subscriber has multiple publications and a table is part of these
publications with different row filters, it should check the publication action
*before* including it in the row filter list. It means that an UPDATE operation
cannot apply a row filter that is part of a publication that has only INSERT as
an action. Having said that we cannot always combine multiple row filter
expressions into one. Instead, it should cache individual row filter expression
and apply the OR during the row filter execution (as I did in the initial
patches before this caching stuff). The other idea is to have multiple caches
for each action. The main disadvantage of this approach is that it creates 4x entries.

I'm experimenting with the first approach, which stores multiple row filters and their publication actions, right now.

We can try that way but I think we should still be able to combine in
many cases like where all the operations are specified for
publications having the table or maybe pubactions are same. So, we
should not give up on those cases. We can do this new logic only when
we find that pubactions are different and probably store them as
independent expressions and corresponding pubactions for it at the
current location in the v42* patch (in pgoutput_row_filter). It is
okay to combine them at a later stage during execution when we can't
do it at the time of forming the cache entry.

What about the initial table sync? During that, are we going to combine all the filters, or apply only the insert filters?
AFAIK, currently, initial table sync doesn't respect publication
actions so it should combine all the filters. What do you think?
--
With Regards,
Amit Kapila.
On Tue, Nov 30, 2021 at 3:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
We can try that way but I think we should still be able to combine in
many cases like where all the operations are specified for
publications having the table or maybe pubactions are same. So, we
should not give up on those cases. We can do this new logic only when
we find that pubactions are different and probably store them as
independent expressions and corresponding pubactions for it at the
current location in the v42* patch (in pgoutput_row_filter). It is
okay to combine them at a later stage during execution when we can't
do it at the time of forming the cache entry.

What about the initial table sync? During that, are we going to combine all the filters, or apply only the insert filters?

AFAIK, currently, initial table sync doesn't respect publication
actions so it should combine all the filters. What do you think?
Yeah, I have the same opinion.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Nov 30, 2021 at 12:33 PM Ajin Cherian <itsajin@gmail.com> wrote:
On Thu, Nov 25, 2021 at 2:22 PM Peter Smith <smithpb2250@gmail.com> wrote:
Thanks for all the review comments so far! We are endeavouring to keep
pace with them.

All feedback is being tracked and we will fix and/or reply to everything ASAP.
Meanwhile, PSA the latest set of v42* patches.
This version was mostly a patch restructuring exercise but it also
addresses some minor review comments in passing.

Addressed more review comments in the attached patch-set v43; 5
patches carried forward from v42.
This patch-set contains the following fixes:

On Tue, Nov 23, 2021 at 1:28 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
in pgoutput_row_filter, we are dropping the slots if there are some
old slots in the RelationSyncEntry. But then I noticed that in
rel_sync_cache_relation_cb(), also we are doing that but only for the
scantuple slot. So IMHO, rel_sync_cache_relation_cb(), is only place
setting entry->rowfilter_valid to false; so why not drop all the slot
that time only and in pgoutput_row_filter(), you can just put an
assert?

Moved all the dropping of slots to rel_sync_cache_relation_cb().
+static bool
+pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot, RelationSyncEntry *entry)
+{
+	EState	   *estate;
+	ExprContext *ecxt;

pgoutput_row_filter_virtual and pgoutput_row_filter are exactly the same except for ExecStoreHeapTuple(), so why not just put one check based on whether a slot is passed or not, instead of making a complete duplicate copy of the function?

Removed pgoutput_row_filter_virtual.
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);

Why do we need to copy the tupledesc? Do we think that we need to have
this slot even if we close the relation, if so can you add the
comments explaining why we are making a copy here.

This code has been modified, and comments added.
On Tue, Nov 23, 2021 at 8:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
One more thing related to this code:

pgoutput_row_filter()
{
..
+	if (!entry->rowfilter_valid)
	{
..
+		oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+		tupdesc = CreateTupleDescCopy(tupdesc);
+		entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+		MemoryContextSwitchTo(oldctx);
..
}

Why do we need to initialize scantuple here unless we are sure that
the row filter is going to get associated with this relentry? I think
when there is no row filter then this allocation is not required.

Modified as suggested.
On Tue, Nov 23, 2021 at 10:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
In the 0003 patch, why is the below change required?

--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -1,4 +1,4 @@
-/*-------------------------------------------------------------------------
+/*------------------------------------------------------------------------
 *
 * pgoutput.c

Removed.
After above, rearrange the code in pgoutput_row_filter(), so that two
different checks related to 'rfisnull' (introduced by different
patches) can be combined as if .. else check.Fixed.
On Thu, Nov 25, 2021 at 12:03 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
+	 * If the new relation or the old relation has a where clause,
+	 * we need to remove it so that it can be added afresh later.
+	 */
+	if (RelationGetRelid(newpubrel->relation) == oldrelid &&
+		newpubrel->whereClause == NULL && rfisnull)

Can't we use _equalPublicationTable() here? It compares the whereClause as well.
Tried this; can't do this because one is an ALTER statement while the other is a publication catalog entry, so the whereClause is not the same node type. In the statement, the whereClause is a T_A_Expr, while in the publication catalog it is a T_OpExpr.

/* Must be owner of the table or superuser. */
-	if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+	if (!pg_class_ownercheck(relid, GetUserId()))

Here, you can directly use RelationGetRelid as was used in the previous code, without using an additional variable.

Fixed.
2.
+typedef struct
+{
+	Relation	rel;
+	bool		check_replident;
+	Bitmapset  *bms_replident;
+}
+rf_context;

Add rf_context on the same line where } ends.
Code has been modified, this comment no longer applies.
4.
+ * Rules: Node-type validation
+ * ---------------------------
+ * Allow only simple or compound expressions like:
+ * - "(Var Op Const)" or

It seems Var Op Var is allowed. I tried the below and it works:

create publication pub for table t1 where (c1 < c2) WITH (publish = 'insert');

I think it should be okay to allow it provided we ensure that we never access some other table/view etc. as part of the expression. Also, we should document the behavior correctly.

Fixed.
On Wed, Nov 24, 2021 at 8:52 PM vignesh C <vignesh21@gmail.com> wrote:
4) This should be included in typedefs.list; also we could add some comments for this structure:
+typedef struct
+{
+	Relation	rel;
+	Bitmapset  *bms_replident;
+}
+rf_context;

This has been removed in the last patch, so the comment no longer applies.
5) Few includes are not required. #include "miscadmin.h" not required
in pg_publication.c, #include "executor/executor.h" not required in
proto.c, #include "access/xact.h", #include "executor/executor.h" and
#include "replication/logicalrelation.h" not required in pgoutput.c

Optimized this: removed "executor/executor.h" from patch 0003, removed "access/xact.h" from patch 0001, and removed "replication/logicalrelation.h" from 0001. The others are required.

6) Typo: "filte" should be "filter":
+/*
+ * The row filte walker checks that the row filter expression is legal.
+ *
+ * Rules: Node-type validation
+ * ---------------------------
+ * Allow only simple or compound expressions like:
+ * - "(Var Op Const)" or
+ * - "(Var Op Const) Bool (Var Op Const)"

Fixed.
Thanks for the updated patch; a few comments:
1) Should this be changed to mention that non-IMMUTABLE system functions are also not allowed:
+ not-null constraints in the <literal>WHERE</literal> clause. The
+ <literal>WHERE</literal> clause does not allow functions or user-defined
+ operators.
+ </para>
2) We can remove the #if 0 code if we don't plan to keep it in the final patch.
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,12 @@ check_agglevels_and_constraints(ParseState
*pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
if (isAgg)
err = _("aggregate functions are not
allowed in publication WHERE expressions");
else
err = _("grouping operations are not
allowed in publication WHERE expressions");
-
+#endif
3) Can a user remove the row filter without removing the table from
the publication after creating the publication or should the user drop
the table and add the table in this case?
4) Should this be changed, since we error out if publisher without
replica identify performs delete or update:
+ The <literal>WHERE</literal> clause must contain only columns that are
+ covered by <literal>REPLICA IDENTITY</literal>, or are part of the primary
+ key (when <literal>REPLICA IDENTITY</literal> is not set), otherwise
+ <command>DELETE</command> or <command>UPDATE</command> operations will not
+ be replicated. That's because old row is used and it only contains primary
+ key or columns that are part of the <literal>REPLICA IDENTITY</literal>; the
+ remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
to:
+ The <literal>WHERE</literal> clause must contain only columns that are
+ covered by <literal>REPLICA IDENTITY</literal>, or are part of the primary
+ key (when <literal>REPLICA IDENTITY</literal> is not set), otherwise
+ <command>DELETE</command> or <command>UPDATE</command> operations will be
+ disallowed on those tables. That's because old row is used and it
only contains primary
+ key or columns that are part of the <literal>REPLICA IDENTITY</literal>; the
+ remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
Regards,
Vignesh
On Thu, Nov 25, 2021 at 12:03 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
+	 * If the new relation or the old relation has a where clause,
+	 * we need to remove it so that it can be added afresh later.
+	 */
+	if (RelationGetRelid(newpubrel->relation) == oldrelid &&
+		newpubrel->whereClause == NULL && rfisnull)

Can't we use _equalPublicationTable() here? It compares the whereClause as well.

Tried this; can't do this because one is an ALTER statement while the other is a publication catalog entry, so the whereClause is not the same node type. In the statement, the whereClause is a T_A_Expr, while in the publication catalog it is a T_OpExpr.
Here we will not be able to do a direct comparison as we store the
transformed where clause in the pg_publication_rel table. We will have
to transform the where clause and then check. I have attached a patch
where we can check the transformed where clause and see if the where
clause is the same or not. If you are ok with this approach you could
make similar changes.
Regards,
Vignesh
Attachments:
Alter_publication_set_table_where_clause_check.patch (text/x-patch)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index b80c21e6ae..ae4a46e44a 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -359,6 +359,32 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
return result;
}
+Node *
+GetTransformedWhereClause(ParseState *pstate, PublicationRelInfo *pri,
+ bool bfixupcollation)
+{
+ ParseNamespaceItem *nsitem;
+ Node *transformedwhereclause = NULL;
+
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, pri->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ transformedwhereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION WHERE");
+
+ /* Fix up collation information */
+ if (bfixupcollation)
+ assign_expr_collations(pstate, transformedwhereclause);
+
+ return transformedwhereclause;
+}
+
/*
* Insert new publication / relation mapping.
*/
@@ -377,7 +403,6 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
ObjectAddress myself,
referenced;
ParseState *pstate;
- ParseNamespaceItem *nsitem;
Node *whereclause = NULL;
List *relids = NIL;
@@ -408,20 +433,8 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
{
/* Set up a ParseState to parse with */
pstate = make_parsestate(NULL);
- pstate->p_sourcetext = nodeToString(pri->whereClause);
-
- nsitem = addRangeTableEntryForRelation(pstate, targetrel,
- AccessShareLock,
- NULL, false, false);
- addNSItemToQuery(pstate, nsitem, false, true, true);
-
- whereclause = transformWhereClause(pstate,
- copyObject(pri->whereClause),
- EXPR_KIND_PUBLICATION_WHERE,
- "PUBLICATION WHERE");
- /* Fix up collation information */
- assign_expr_collations(pstate, whereclause);
+ whereclause = GetTransformedWhereClause(pstate, pri, true);
/*
* Walk the parse-tree of this publication row filter expression and
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 1c792ed9cb..6106ec25d4 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -497,6 +497,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *rels = NIL;
Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
Oid pubid = pubform->oid;
+ Node *oldrelwhereclause = NULL;
/*
* It is quite possible that for the SET case user has not specified any
@@ -554,8 +555,13 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
ObjectIdGetDatum(pubid));
if (HeapTupleIsValid(rftuple))
{
- SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual,
- &rfisnull);
+ Datum whereClauseDatum;
+
+ whereClauseDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual,
+ &rfisnull);
+ if (!rfisnull)
+ oldrelwhereclause = stringToNode(TextDatumGetCString(whereClauseDatum));
+
ReleaseSysCache(rftuple);
}
@@ -569,11 +575,31 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
* If the new relation or the old relation has a where clause,
* we need to remove it so that it can be added afresh later.
*/
- if (RelationGetRelid(newpubrel->relation) == oldrelid &&
- newpubrel->whereClause == NULL && rfisnull)
+ if (RelationGetRelid(newpubrel->relation) == oldrelid)
{
- found = true;
- break;
+ if (rfisnull && !newpubrel->whereClause)
+ {
+ found = true;
+ break;
+ }
+
+ if (!rfisnull && newpubrel->whereClause)
+ {
+ ParseState *pstate = make_parsestate(NULL);
+ Node *whereclause;
+
+ whereclause = GetTransformedWhereClause(pstate,
+ newpubrel,
+ false);
+ if (equal(oldrelwhereclause, whereclause))
+ {
+ free_parsestate(pstate);
+ found = true;
+ break;
+ }
+
+ free_parsestate(pstate);
+ }
}
}
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index f698049633..8c63dd5f85 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -20,6 +20,7 @@
#include "catalog/genbki.h"
#include "catalog/objectaddress.h"
#include "catalog/pg_publication_d.h"
+#include "parser/parse_node.h"
/* ----------------
* pg_publication definition. cpp turns this into
@@ -142,7 +143,9 @@ extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
extern char *get_publication_name(Oid pubid, bool missing_ok);
-
+extern Node *GetTransformedWhereClause(ParseState *pstate,
+ PublicationRelInfo *pri,
+ bool bfixupcollation);
extern bool check_rowfilter_replident(Node *node, Bitmapset *bms_replident);
#endif /* PG_PUBLICATION_H */
On Tue, Nov 30, 2021 at 9:34 PM vignesh C <vignesh21@gmail.com> wrote:
3) Can a user remove the row filter without removing the table from
the publication after creating the publication or should the user drop
the table and add the table in this case?
AFAIK to remove an existing filter use ALTER PUBLICATION ... SET TABLE
but do not specify any filter.
For example,
test_pub=# create table t1(a int primary key);
CREATE TABLE
test_pub=# create publication p1 for table t1 where (a > 1);
CREATE PUBLICATION
test_pub=# create publication p2 for table t1 where (a > 2);
CREATE PUBLICATION
test_pub=# \d+ t1
Table "public.t1"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 a      | integer |           | not null |         | plain   |             |              |
Indexes:
"t1_pkey" PRIMARY KEY, btree (a)
Publications:
"p1" WHERE ((a > 1))
"p2" WHERE ((a > 2))
Access method: heap
test_pub=# alter publication p1 set table t1;
ALTER PUBLICATION
test_pub=# \d+ t1
Table "public.t1"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 a      | integer |           | not null |         | plain   |             |              |
Indexes:
"t1_pkey" PRIMARY KEY, btree (a)
Publications:
"p1"
"p2" WHERE ((a > 2))
Access method: heap
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Nov 30, 2021, at 7:25 AM, Amit Kapila wrote:
On Tue, Nov 30, 2021 at 11:37 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
What about the initial table sync? During that, are we going to
combine all the filters, or are we going to apply only the insert
filters?
AFAIK, currently, initial table sync doesn't respect publication
actions so it should combine all the filters. What do you think?
I agree. If you think that it might need a row to apply DML commands (UPDATE,
DELETE) in the future or that due to a row filter that row should be available
in the subscriber (INSERT-only case), it makes sense to send all rows that
satisfy any row filter.
The current code already works this way. All row filters are combined into a
WHERE clause using OR. If any of the publications don't have a row filter,
there is no WHERE clause.
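To make that concrete, the initial copy for a subscription to both p1 and p2 above could be sketched as a query of roughly this shape (illustrative only; the actual tablesync COPY statement is built internally by the walsender):

```sql
-- Hypothetical sketch of how the initial copy combines row filters
-- from multiple publications with OR:
COPY (
    SELECT a
    FROM public.t1
    WHERE (a > 1) OR (a > 2)   -- one disjunct per publication's filter
) TO STDOUT;
-- If any publication publishes t1 with no filter at all, the WHERE
-- clause is omitted entirely and every row is copied.
```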
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Wed, Dec 1, 2021 at 6:55 AM Euler Taveira <euler@eulerto.com> wrote:
On Tue, Nov 30, 2021, at 7:25 AM, Amit Kapila wrote:
On Tue, Nov 30, 2021 at 11:37 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
What about the initial table sync? During that, are we going to
combine all the filters, or are we going to apply only the insert
filters?
AFAIK, currently, initial table sync doesn't respect publication
actions so it should combine all the filters. What do you think?
I agree. If you think that it might need a row to apply DML commands (UPDATE,
DELETE) in the future or that due to a row filter that row should be available
in the subscriber (INSERT-only case), it makes sense to send all rows that
satisfy any row filter.
Right and Good point.
--
With Regards,
Amit Kapila.
PSA the v44* set of patches.
The following review comments are addressed:
v44-0001 main patch
- Renamed the TAP test 026->027 due to clash caused by recent commit [1]
- Refactored table_close [Houz 23/11] #2
- Alter compare where clauses [Amit 24/11] #0
- PG docs CREATE SUBSCRIPTION [Tang 30/11] #2
- PG docs CREATE PUBLICATION [Vignesh 30/11] #1, #4, [Tang 30/11] #1,
[Tomas 23/9] #2
v44-0002 validation walker
- Add NullTest support [Peter 18/11]
- Update comments [Amit 24/11] #3
- Disallow user-defined types [Amit 24/11] #4
- Errmsg - skipped because handled by top-up [Vignesh 23/11] #2
- Removed #if 0 [Vignesh 30/11] #2
v44-0003 new/old tuple
- NA
v44-0004 tab-complete and pgdump
- Handle table-list commas better [Vignesh 23/11] #2
v44-0005 top-up patch for validation
- (This patch will be added again later)
------
[1]: https://github.com/postgres/postgres/commit/8d74fc96db5fd547e077bf9bf4c3b67f821d71cd
[Tomas 23/9] /messages/by-id/574b4e78-2f35-acf3-4bdc-4b872582e739@enterprisedb.com
[Peter 18/11] /messages/by-id/CAFPTHDa67_H=sALy+EqXDGmUKm1MO-83apffZkO34RELjt_Prg@mail.gmail.com
[Houz 23/11] /messages/by-id/OS0PR01MB57162EB465A0E6BCFDF9B3F394609@OS0PR01MB5716.jpnprd01.prod.outlook.com
[Vignesh 23/11]
/messages/by-id/CALDaNm2bq-Zab3i5pvuA3UTxHvo3BqPwmgXbyznpw5vz4=fxpA@mail.gmail.com
[Amit 24/11] /messages/by-id/CAA4eK1+Xd=kM5D3jtXyN+W7J+wU-yyQAdyq66a6Wcq_PKRTbSw@mail.gmail.com
[Tang 30/11] /messages/by-id/OS0PR01MB6113F2E024961A9C7F36BEADFB679@OS0PR01MB6113.jpnprd01.prod.outlook.com
[Vignesh 30/11]
/messages/by-id/CALDaNm2T3yXJkuKXARUUh+=_36Ry7gYxUqhpgW8AxECug9nH6Q@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v44-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v44-0001-Row-filter-for-logical-replication.patchDownload
From 2d4971ec6f9d37c13484fdd2eee3055bef4a69c1 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 1 Dec 2021 18:50:01 +1100
Subject: [PATCH v44] Row filter for logical replication.
This feature adds row filtering for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The row-filter WHERE clause for a table added to a publication that publishes
UPDATE and/or DELETE operations must contain only columns that are covered by
REPLICA IDENTITY. The row-filter WHERE clause for a table added to a publication
that publishes INSERT can use any column. The WHERE clause does not allow
user-defined functions / operators / types; it also does not allow built-in
functions unless they are immutable.
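As an illustrative sketch of that restriction (the table, column names, and publication names below are invented for the example):

```sql
CREATE TABLE orders (id int PRIMARY KEY, note text);

-- OK: the filter references only "id", which is covered by the
-- replica identity (the primary key by default):
CREATE PUBLICATION pub_upd FOR TABLE orders WHERE (id > 100)
    WITH (publish = 'insert, update, delete');

-- OK: an insert-only publication may filter on any column:
CREATE PUBLICATION pub_ins FOR TABLE orders WHERE (note IS NOT NULL)
    WITH (publish = 'insert');

-- Expected to fail: "note" is not covered by the replica identity,
-- but the publication (by default) publishes UPDATE and DELETE:
-- CREATE PUBLICATION pub_bad FOR TABLE orders WHERE (note IS NOT NULL);
```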
If you choose to do the initial table synchronization, only data that satisfies
the row filters is pulled by the subscriber. If the subscription has several
publications in which a table has been published with different WHERE clauses,
rows which satisfy ANY of the expressions will be copied. If the subscriber is
a pre-15 version, data synchronization won't use row filters even if they are
defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
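A hedged sketch of that rule (the partitioned table and partition names are made up for illustration):

```sql
CREATE TABLE measurement (city_id int, logdate date)
    PARTITION BY RANGE (logdate);
CREATE TABLE measurement_2021 PARTITION OF measurement
    FOR VALUES FROM ('2021-01-01') TO ('2022-01-01');

-- With publish_via_partition_root = true, changes are published using
-- the root table's identity, so the root's row filter is applied:
CREATE PUBLICATION pub_root FOR TABLE measurement WHERE (city_id > 10)
    WITH (publish_via_partition_root = true);

-- With the default (false), changes are published per partition, so a
-- row filter attached to the leaf partition would be used instead.
```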
Psql commands \dRp+ and \d+ will display any row filters.
Author: Euler Taveira, Peter Smith
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
Combining multiple row-filters
==============================
The subscription is treated "as a union of all the publications" [1], so the
row-filters are combined with OR.
If the subscription has several publications in which the same table has been
published with different filters, those expressions get OR'ed together so that
rows satisfying any of the expressions will be replicated.
Notice this means if one of the publications has no filter at all then all other
filters become redundant.
Author: Peter Smith
[1] https://www.postgresql.org/message-id/574b4e78-2f35-acf3-4bdc-4b872582e739%40enterprisedb.com
Row-filter caching
==================
The cached row-filters (e.g. ExprState *) are invalidated only in function
rel_sync_cache_relation_cb, so it means the ALTER PUBLICATION for one table
will not cause row-filters of other tables to also become invalidated.
The code related to caching row-filters is done just before they are needed
(in the pgoutput_row_filter function).
If there are multiple publication filters for a given table these are all
combined/flattened into a single filter.
Author: Peter Smith, Greg Nancarrow
The filter caching is based on suggestions from Amit [1][2] and Houz [3]
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
[3] https://www.postgresql.org/message-id/OS0PR01MB5716090A70A73ADF58C58950948D9%40OS0PR01MB5716.jpnprd01.prod.outlook.com
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 13 +-
doc/src/sgml/ref/create_publication.sgml | 28 ++-
doc/src/sgml/ref/create_subscription.sgml | 17 ++
src/backend/catalog/pg_publication.c | 62 ++++-
src/backend/commands/publicationcmds.c | 105 ++++++--
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 38 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 116 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 344 ++++++++++++++++++++++++++-
src/bin/psql/describe.c | 27 ++-
src/include/catalog/pg_publication.h | 7 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 148 ++++++++++++
src/test/regress/sql/publication.sql | 75 ++++++
src/test/subscription/t/027_row_filter.pl | 357 ++++++++++++++++++++++++++++
23 files changed, 1347 insertions(+), 49 deletions(-)
mode change 100644 => 100755 src/backend/parser/gram.y
create mode 100644 src/test/subscription/t/027_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1d11be..af6b1f6 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6299,6 +6299,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index bb4ef5e..01247d7 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -30,7 +30,7 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -52,7 +52,9 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
remove one or more tables/schemas from the publication. Note that adding
tables/schemas to a publication that is already subscribed to will require a
<literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on the
- subscribing side in order to become effective.
+ subscribing side in order to become effective. Note also that the combination
+ of row-filter <literal>WHERE</literal> for <literal>DROP</literal> clause is
+ not allowed.
</para>
<para>
@@ -109,7 +111,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 4aeb0c8..f06dd92 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -28,7 +28,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -76,6 +76,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -226,6 +230,16 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The row-filter <literal>WHERE</literal> clause for a table added to a publication that
+ publishes <command>UPDATE</command> and/or <command>DELETE</command> operations must
+ contain only columns that are covered by <literal>REPLICA IDENTITY</literal>. The
+ row-filter <literal>WHERE</literal> clause for a table added to a publication that
+ publishes <command>INSERT</command> can use any column. The <literal>WHERE</literal>
+ clause does not allow user-defined functions / operators / types; it also does not allow
+ built-in functions unless they are immutable.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -240,6 +254,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -253,6 +272,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..8453467 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -208,6 +208,10 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
that are being subscribed to when the replication starts.
The default is <literal>true</literal>.
</para>
+ <para>
+ Row-filtering may also apply here and will affect what data is
+ copied. Refer to the Notes section below.
+ </para>
</listitem>
</varlistentry>
@@ -319,6 +323,19 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
the parameter <literal>create_slot = false</literal>. This is an
implementation restriction that might be lifted in a future release.
</para>
+
+ <para>
+ If any table in the publication has a <literal>WHERE</literal> clause, rows
+ that do not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be replicated. If the subscription has several publications in
+ which the same table has been published with different filters, those
+ expressions get OR'ed together so that rows satisfying any of the expressions
+ will be replicated. Notice this means if one of the publications has no filter
+ at all then all other filters become redundant. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row filtering
+ is ignored during data synchronization.
+ </para>
+
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 63579b2..89d00cd 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -253,22 +256,51 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
return result;
}
+Node *
+GetTransformedWhereClause(ParseState *pstate, PublicationRelInfo *pri,
+ bool bfixupcollation)
+{
+ ParseNamespaceItem *nsitem;
+ Node *transformedwhereclause = NULL;
+
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, pri->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ transformedwhereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION WHERE");
+
+ /* Fix up collation information */
+ if (bfixupcollation)
+ assign_expr_collations(pstate, transformedwhereclause);
+
+ return transformedwhereclause;
+}
+
/*
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -289,10 +321,19 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a ParseState to parse with */
+ pstate = make_parsestate(NULL);
+
+ /* Fix up collation information */
+ whereclause = GetTransformedWhereClause(pstate, pri, true);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -306,6 +347,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -322,6 +369,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 7d4a0e9..6373fa2 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -497,6 +497,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *rels = NIL;
Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
Oid pubid = pubform->oid;
+ Node *oldrelwhereclause = NULL;
/*
* It is quite possible that for the SET case user has not specified any
@@ -529,40 +530,92 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *delrels = NIL;
ListCell *oldlc;
+ /*
+ * Check if the relation is member of the existing schema in the
+ * publication or member of the schema list specified.
+ */
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
- /* Calculate which relations to drop. */
+ /*
+ * In order to recreate the relation list for the publication,
+ * look for existing relations that need not be dropped.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
+ ListCell *newlc;
+ PublicationRelInfo *oldrel;
bool found = false;
+ HeapTuple rftuple;
+ bool rfisnull = true;
+
+ /* look up the cache for the old relmap */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(oldrelid),
+ ObjectIdGetDatum(pubid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ Datum whereClauseDatum;
+
+ whereClauseDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual,
+ &rfisnull);
+ if (!rfisnull)
+ oldrelwhereclause = stringToNode(TextDatumGetCString(whereClauseDatum));
+
+ ReleaseSysCache(rftuple);
+ }
foreach(newlc, rels)
{
PublicationRelInfo *newpubrel;
newpubrel = (PublicationRelInfo *) lfirst(newlc);
+
+ /*
+ * Look if any of the new set of relations match with
+ * the existing relations in the publication. Additionally,
+ * if the relation has an associated where-clause, check the
+ * where-clauses also match. Drop the rest.
+ */
if (RelationGetRelid(newpubrel->relation) == oldrelid)
{
- found = true;
- break;
+ if (rfisnull && !newpubrel->whereClause)
+ {
+ found = true;
+ break;
+ }
+
+ if (!rfisnull && newpubrel->whereClause)
+ {
+ ParseState *pstate = make_parsestate(NULL);
+ Node *whereclause;
+
+ whereclause = GetTransformedWhereClause(pstate,
+ newpubrel,
+ false);
+ if (equal(oldrelwhereclause, whereclause))
+ {
+ free_parsestate(pstate);
+ found = true;
+ break;
+ }
+
+ free_parsestate(pstate);
+ }
}
}
- /* Not yet in the list, open it and add to the list */
+
+ /*
+ * Add the non-matched relations to a list so that they can
+ * be dropped.
+ */
if (!found)
{
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrelid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
}
@@ -899,6 +952,7 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ List *relids_with_rf = NIL;
/*
* Open, share-lock, and check all the explicitly-specified relations
@@ -926,15 +980,26 @@ OpenTableList(List *tables)
*/
if (list_member_oid(relids, myrelid))
{
+ /* Disallow duplicate tables if there are any with row-filters. */
+ if (t->whereClause || list_member_oid(relids_with_rf, myrelid))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("conflicting or redundant row-filters for \"%s\"",
+ RelationGetRelationName(rel))));
+
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
+ if (t->whereClause)
+ relids_with_rf = lappend_oid(relids_with_rf, myrelid);
+
/*
* Add children of this rel, if requested, so that they too are added
* to the publication. A partitioned table can't have any inheritance
@@ -967,6 +1032,8 @@ OpenTableList(List *tables)
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -974,6 +1041,7 @@ OpenTableList(List *tables)
}
list_free(relids);
+ list_free(relids_with_rf);
return rels;
}
@@ -993,6 +1061,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -1088,6 +1158,11 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
RelationGetRelationName(rel))));
}
+ if (pubrel->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE")));
+
ObjectAddressSet(obj, PublicationRelRelationId, prid);
performDeletion(&obj, DROP_CASCADE, 0);
}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 297b6ee..be9c1fb 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4832,6 +4832,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index f537d3e..5776447 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2312,6 +2312,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
old mode 100644
new mode 100755
index 86ce33b..8e96d54
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9654,12 +9654,13 @@ CreatePublicationStmt:
* relation_expr here.
*/
PublicationObjSpec:
- TABLE relation_expr
+ TABLE relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_TABLE;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $2;
+ $$->pubtable->whereClause = $3;
}
| ALL TABLES IN_P SCHEMA ColId
{
@@ -9674,28 +9675,45 @@ PublicationObjSpec:
$$->pubobjtype = PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA;
$$->location = @5;
}
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ /*
+ * The OptWhereClause (row-filter) must be stored here
+ * but it is valid only for tables. If the ColId was
+ * mistakenly not a table this will be detected later
+ * in preprocess_pubobj_list() and an error is thrown.
+ */
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
$$->location = @1;
}
- | ColId indirection
+ | ColId indirection OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
+ $$->pubtable->whereClause = $3;
$$->location = @1;
}
/* grammar like tablename * , ONLY tablename, ONLY ( tablename ) */
- | extended_relation_expr
+ | extended_relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $1;
+ $$->pubtable->whereClause = $2;
}
| CURRENT_SCHEMA
{
@@ -17343,7 +17361,8 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ if (pubobj->name)
{
/* convert it to PublicationTable */
PublicationTable *pubtable = makeNode(PublicationTable);
@@ -17356,6 +17375,13 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
else if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_SCHEMA ||
pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA)
{
+ /* Row filters are not allowed on schema objects. */
+ if (pubobj->pubtable && pubobj->pubtable->whereClause)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid to use WHERE (row-filter) for a schema"),
+ parser_errposition(pubobj->location));
+
/*
* We can distinguish between the different type of schema
* objects based on whether name and pubtable is set.
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 542f916..29bebb7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
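The operator check above rejects any operator created after initdb (OID at or beyond FirstNormalObjectId). As an illustrative sketch (the table and operator here are invented, not from the patch), a custom operator in a publication WHERE clause would fail like this:

```sql
-- hypothetical: t is a table with an integer column "id"
CREATE OPERATOR =#> (LEFTARG = integer, RIGHTARG = integer, PROCEDURE = int4eq);
CREATE PUBLICATION p FOR TABLE t WHERE (id =#> 5);
-- ERROR:  user-defined operators are not allowed in publication WHERE expressions
```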
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..af73b14 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,80 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+	 * Get relation qual. DISTINCT prevents the same expression of a table in
+	 * multiple publications from being included multiple times in the final
+	 * expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ /*
+	 * Multiple row-filter expressions for the same table will later be
+	 * combined by the COPY using OR, but this means that if any of the
+	 * filters is null, then effectively none of the other filters is
+	 * meaningful. So this loop also checks for null filters and can exit
+	 * early if any are encountered.
+ */
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+
+ if (isnull)
+ {
+ /*
+ * A single null filter nullifies the effect of any other filter for this
+ * table.
+ */
+ if (*qual)
+ {
+ list_free(*qual);
+ *qual = NIL;
+ }
+ break;
+ }
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +887,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +896,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +907,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with OR'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +927,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of OR'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
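For illustration (the table and filter expressions here are invented), when a table is published with row filters in two publications, the initial-sync query that copy_table builds above would look roughly like:

```sql
-- hypothetical publisher-side query for a table public.t(a, b)
-- whose subscribed publications carry filters (a > 1000) and (b < 500)
COPY (SELECT a, b FROM public.t WHERE (a > 1000) OR (b < 500)) TO STDOUT;
```

If any subscribed publication has no filter for the table, the qual list is emptied and the plain `COPY public.t TO STDOUT` form is used instead.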
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6f6a203..3b85915 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -15,16 +15,24 @@
#include "access/tupconvert.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/makefuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -116,6 +124,17 @@ typedef struct RelationSyncEntry
PublicationActions pubactions;
/*
+ * Row-filter related members:
+	 * The flag 'rowfilter_valid' indicates whether the exprstate has been
+	 * assigned yet. We cannot just use the exprstate value for this purpose
+	 * because there might be no filter at all for the current relid (i.e.
+	 * exprstate is NULL).
+ */
+ bool rowfilter_valid;
+ ExprState *exprstate; /* ExprState for row filter(s) */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+
+ /*
* OID of the relation to publish changes as. For a partition, this may
* be set to one of its ancestors whose schema will be used when
* replicating changes, if publish_via_partition_root is set for the
@@ -137,7 +156,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +165,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +647,265 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+
+ /* Cache ExprState using CacheMemoryContext. */
+	Assert(CurrentMemoryContext == CacheMemoryContext);
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be cast to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * This is the same code as ExecPrepareExpr() but that is not used because
+ * we have no EState to pass it. There should probably be another function
+ * in the executor to handle the execution outside a normal Plan tree
+ * context.
+ */
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false, i.e. the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated; otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ List *rfnodes = NIL;
+ int n_filters;
+
+ /*
+	 * If the row filter caching is currently flagged "invalid" then we don't
+	 * know yet whether there are any row filters for this relation.
+	 *
+	 * This code usually executes only once per relation.
+ *
+ * NOTE: The ExprState cache could have been created up-front in the
+ * function get_rel_sync_entry() instead of the deferred on-the-fly
+ * assignment below. The reason for choosing to do it here is because there
+ * are some scenarios where the get_rel_sync_entry() is called but where a
+ * row will not be published. For example, for truncate, we may not need
+ * any row evaluation, so there is no need to compute it. It would also be
+ * a waste if any error happens before actually evaluating the filter. And
+ * tomorrow there could be other operations (which use get_rel_sync_entry)
+ * but which don't need to build ExprState. Furthermore, because the
+ * decision to publish or not is made AFTER the call to get_rel_sync_entry
+ * it may be that the filter evaluation is not necessary at all. So the
+ * decision was to defer this logic to last moment when we know it will be
+ * needed.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ MemoryContext oldctx;
+
+ /* Release the tuple table slot if it already exists. */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState and cache it in entry->exprstate.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ *
+ * NOTE: Multiple row-filters for the same table are combined by OR-ing
+ * them together, but this means that if (in any of the publications)
+ * there is *no* filter then effectively none of the other filters have
+ * any meaning either.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+
+ /*
+			 * Look up whether there is a row filter, and if so, remember it in
+			 * a list. All filters are combined after this 'publications' loop.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(entry->publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes = lappend(rfnodes, rfnode);
+ MemoryContextSwitchTo(oldctx);
+
+ ReleaseSysCache(rftuple);
+ }
+ else
+ {
+ /*
+ * If there is no row-filter, then any other row-filters for this table
+ * also have no effect (because filters get OR-ed together) so we can
+ * just discard anything found so far and exit early from the publications
+ * loop.
+ */
+ if (rfnodes)
+ {
+ list_free_deep(rfnodes);
+ rfnodes = NIL;
+ }
+ ReleaseSysCache(rftuple);
+ break;
+ }
+
+ }
+
+ } /* loop all subscribed publications */
+
+ /*
+		 * Combine all the row-filters (if any) into a single filter, and then
+		 * build the ExprState for it.
+ */
+ n_filters = list_length(rfnodes);
+ if (n_filters > 0)
+ {
+ Node *rfnode;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes, -1) : linitial(rfnodes);
+ entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains.
+ */
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ entry->rowfilter_valid = true;
+ }
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ if (message_level_is_interesting(DEBUG3))
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * NOTE: Multiple publication row-filters have already been combined to a
+ * single exprstate.
+ */
+ if (entry->exprstate)
+ {
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(entry->exprstate, ecxt);
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +932,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +956,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +963,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * Schema should be sent before the logic that replaces the
+ * relation because it also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +996,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +1030,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +1099,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1116,9 +1421,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1139,8 +1445,11 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1245,9 +1554,6 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
}
list_free(pubids);
@@ -1354,6 +1660,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanups. (Will be rebuilt later if needed).
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstate != NULL)
+ {
+ pfree(entry->exprstate);
+ entry->exprstate = NULL;
+ }
}
}
@@ -1365,6 +1686,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1696,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1392,6 +1716,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index ea721d9..8be5643 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3150,17 +3150,22 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
" JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
"WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
"UNION\n"
"SELECT pubname\n"
+ " , pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+ " , pg_catalog.pg_class c\n"
"WHERE pr.prrelid = '%s'\n"
+ " AND c.oid = pr.prrelid\n"
"UNION\n"
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
"WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
"ORDER BY 1;",
@@ -3196,6 +3201,13 @@ describeOneTableDetails(const char *schemaname,
printfPQExpBuffer(&buf, " \"%s\"",
PQgetvalue(result, i, 0));
+ if (pset.sversion >= 150000)
+ {
+ /* Also display the publication row-filter (if any) for this table */
+ if (!PQgetisnull(result, i, 1))
+ appendPQExpBuffer(&buf, " WHERE (%s)", PQgetvalue(result, i, 1));
+ }
+
printTableAddFooter(&cont, buf.data);
}
PQclear(result);
@@ -6320,8 +6332,12 @@ addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
for (i = 0; i < count; i++)
{
if (!singlecol)
+ {
printfPQExpBuffer(buf, " \"%s.%s\"", PQgetvalue(res, i, 0),
PQgetvalue(res, i, 1));
+ if (!PQgetisnull(res, i, 2))
+ appendPQExpBuffer(buf, " WHERE (%s)", PQgetvalue(res, i, 2));
+ }
else
printfPQExpBuffer(buf, " \"%s\"", PQgetvalue(res, i, 0));
@@ -6450,8 +6466,15 @@ describePublications(const char *pattern)
{
/* Get the tables for the specified publication */
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 1ae439e..4a25222 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -20,6 +20,7 @@
#include "catalog/genbki.h"
#include "catalog/objectaddress.h"
#include "catalog/pg_publication_d.h"
+#include "parser/parse_node.h"
/* ----------------
* pg_publication definition. cpp turns this into
@@ -86,6 +87,7 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -122,13 +124,16 @@ extern List *GetPubPartitionOptionRelations(List *result,
Oid relid);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
extern char *get_publication_name(Oid pubid, bool missing_ok);
+extern Node *GetTransformedWhereClause(ParseState *pstate,
+ PublicationRelInfo *pri,
+ bool bfixupcollation);
#endif /* PG_PUBLICATION_H */
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 067138e..5d58a9f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3641,6 +3641,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 1feb558..6959675 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -239,6 +239,154 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+ Table "public.testpub_rf_tbl1"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+----------+--------------+-------------
+ a | integer | | | | plain | |
+ b | text | | | | extended | |
+Publications:
+ "testpub5a" WHERE ((a > 1))
+ "testpub5b"
+ "testpub5c" WHERE ((a > 3))
+
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+ Publication testpub_syntax1
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE ((e < 999))
+
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+ Publication testpub_syntax2
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
+
+DROP PUBLICATION testpub_syntax2;
+-- fail - a WHERE row-filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a =...
+ ^
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+ERROR: invalid to use WHERE (row-filter) for a schema
+LINE 1: ...ax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf...
+ ^
+RESET client_min_messages;
+-- fail - the same table cannot be specified twice if either entry has a WHERE row-filter
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+ERROR: cannot add relation "testpub_rf_myschema1.testpub_rf_tb16" to publication
+DETAIL: Table's schema "testpub_rf_myschema1" is already part of the publication or part of the specified schema list.
+RESET client_min_messages;
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 8fa0435..40198fc 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -134,6 +134,81 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;
+-- fail - a WHERE row-filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+RESET client_min_messages;
+-- fail - the same table cannot be specified twice if either entry has a WHERE row-filter
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+RESET client_min_messages;
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
new file mode 100644
index 0000000..64e71d0
--- /dev/null
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -0,0 +1,357 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 10;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4a FOR TABLE tab_rowfilter_4 WHERE (c % 2 = 0)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4b FOR TABLE tab_rowfilter_4"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) SELECT generate_series(1, 10)");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4a, tap_pub_4b"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if the
+# c value is divisible by 2 OR 3 (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(13|2|20), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but when there is no filter it just means
+# OR everything - i.e. the same as no filter at all.
+# Expect all rows: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(10|1|10), 'check initial data copy from table tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# the publication option publish_via_partition_root is false, so the row
+# filter of each partition is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) VALUES (21), (22), (23), (24), (25)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) VALUES (0), (11), (12)");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if the
+# c value is divisible by 2 OR 3.
+#
+# Expect original rows (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+# Plus (21, 22, 24)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(16|2|24), 'check replicated rows to tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but when there is no filter it just means
+# OR everything - i.e. the same as no filter at all.
+# Expect all rows from initial copy: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+# And also (0, 11, 12)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# the publication option publish_via_partition_root is true, so the row
+# filter of the root partitioned table is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
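As an aside for readers following the TAP test above: the combining rule it exercises (a row is replicated if the filter of any publication covering the table matches, and a filter-less publication matches everything) can be sketched outside PostgreSQL. This is plain illustrative Python, not patch code; the publication names and predicates are taken from the test.

```python
# Sketch of the row-filter semantics exercised by 027_row_filter.pl.
# A row is replicated if ANY publication's filter matches; a
# publication without a filter matches every row.

def row_is_published(row, filters):
    """filters: one entry per publication; None means 'no filter'."""
    return any(f is None or f(row) for f in filters)

# tab_rowfilter_2: tap_pub_1 (c % 2 = 0) OR'ed with tap_pub_2 (c % 3 = 0)
replicated = [c for c in range(1, 21)
              if row_is_published(c, [lambda c: c % 2 == 0,
                                      lambda c: c % 3 == 0])]
# -> 13 rows (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20),
#    matching the test's count(c)=13, min(c)=2, max(c)=20

# tab_rowfilter_4: tap_pub_4a (c % 2 = 0) OR'ed with unfiltered tap_pub_4b
all_rows = [c for c in range(1, 11)
            if row_is_published(c, [lambda c: c % 2 == 0, None])]
# -> every row, i.e. the same as no filter at all
```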
v44-0003-Support-updates-based-on-old-and-new-tuple-in-ro.patchapplication/octet-stream; name=v44-0003-Support-updates-based-on-old-and-new-tuple-in-ro.patchDownload
From 436125eee26d322a91f426ebc104b48def45b415 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 1 Dec 2021 19:23:29 +1100
Subject: [PATCH v44] Support updates based on old and new tuple in row
filters.
When applying a row filter to updates, check both the old_tuple and the
new_tuple to decide how the update needs to be transformed.
UPDATE
old-row (match) new-row (no match) -> DELETE
old-row (no match) new row (match) -> INSERT
old-row (match) new row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Tuples that have been deformed are also cached in slots to avoid
deforming the same tuple multiple times.
Author: Ajin Cherian
---
src/backend/replication/logical/proto.c | 35 +++--
src/backend/replication/pgoutput/pgoutput.c | 194 +++++++++++++++++++++++++---
src/include/replication/logicalproto.h | 7 +-
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/027_row_filter.pl | 4 +-
src/tools/pgindent/typedefs.list | 1 +
6 files changed, 209 insertions(+), 38 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..6b55a94 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -31,8 +32,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
- HeapTuple tuple, bool binary);
-
+ HeapTuple tuple, TupleTableSlot *slot,
+ bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -410,7 +411,7 @@ logicalrep_write_insert(StringInfo out, TransactionId xid, Relation rel,
pq_sendint32(out, RelationGetRelid(rel));
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, NULL, binary);
}
/*
@@ -442,7 +443,8 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
*/
void
logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
- HeapTuple oldtuple, HeapTuple newtuple, bool binary)
+ HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary)
{
pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
@@ -463,11 +465,11 @@ logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
pq_sendbyte(out, 'O'); /* old tuple follows */
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, oldslot, binary);
}
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, newslot, binary);
}
/*
@@ -536,7 +538,7 @@ logicalrep_write_delete(StringInfo out, TransactionId xid, Relation rel,
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, NULL, binary);
}
/*
@@ -749,11 +751,12 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
-logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binary)
+logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, TupleTableSlot *slot,
+bool binary)
{
TupleDesc desc;
- Datum values[MaxTupleAttributeNumber];
- bool isnull[MaxTupleAttributeNumber];
+ Datum *values;
+ bool *isnull;
int i;
uint16 nliveatts = 0;
@@ -771,7 +774,17 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binar
enlargeStringInfo(out, tuple->t_len +
nliveatts * (1 + 4));
- heap_deform_tuple(tuple, desc, values, isnull);
+ if (slot == NULL || TTS_EMPTY(slot))
+ {
+ values = (Datum *) palloc(desc->natts * sizeof(Datum));
+ isnull = (bool *) palloc(desc->natts * sizeof(bool));
+ heap_deform_tuple(tuple, desc, values, isnull);
+ }
+ else
+ {
+ values = slot->tts_values;
+ isnull = slot->tts_isnull;
+ }
/* Write the values */
for (i = 0; i < desc->natts; i++)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 3b85915..0ccffa7 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,6 +13,7 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
#include "catalog/pg_publication_rel.h"
@@ -25,6 +26,7 @@
#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
#include "utils/builtins.h"
@@ -132,7 +134,10 @@ typedef struct RelationSyncEntry
*/
bool rowfilter_valid;
ExprState *exprstate; /* ExprState for row filter(s) */
- TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
+ TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
+ TupleTableSlot *tmp_new_tuple; /* slot for temporary new tuple used for expression evaluation */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -167,10 +172,15 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
- HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, TupleTableSlot *slot,
+ RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry,
+ ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -734,18 +744,112 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * Update is checked against the row filter, if any.
+ * Updates are transformed to inserts and deletes based on the
+ * old_tuple and new_tuple. The new action is returned in the
+ * action parameter; if it is not changed, the action remains UPDATE.
+ * old-row (no match) new-row (no match) -> (drop change)
+ * old-row (no match) new row (match) -> INSERT
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (match) new row (match) -> UPDATE
+ * If the change is to be replicated returns true, else false.
*/
static bool
-pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
+{
+ TupleDesc desc = RelationGetDescr(relation);
+ int i;
+ bool old_matched, new_matched;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ /* update requires a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /*
+ * If no old_tuple, then none of the replica identity columns changed
+ * and this would reduce to a simple update.
+ */
+ if (!oldtuple)
+ {
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ return pgoutput_row_filter(relation, NULL, newtuple, NULL, entry);
+ }
+
+ old_slot = entry->old_tuple;
+ new_slot = entry->new_tuple;
+ tmp_new_slot = entry->tmp_new_tuple;
+ ExecClearTuple(old_slot);
+ ExecClearTuple(new_slot);
+ ExecClearTuple(tmp_new_slot);
+
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, new_slot);
+
+ /*
+ * For updates, both the new tuple and the old tuple need to be checked
+ * against the row filter. The new tuple might not contain all the
+ * replica identity columns, in which case the missing values need to
+ * be copied over from the old tuple.
+ */
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ /* if the column in the new_tuple is null, nothing to do */
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ /*
+ * Unchanged toasted replica identity columns are only
+ * detoasted in the old tuple; copy those values over to the new tuple.
+ */
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ Assert(!old_slot->tts_isnull[i] &&
+ !VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]));
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ }
+
+ }
+
+ old_matched = pgoutput_row_filter(relation, NULL, NULL, old_slot, entry);
+ new_matched = pgoutput_row_filter(relation, NULL, NULL, tmp_new_slot, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (!old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && old_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter the first time it is needed.
+ */
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
{
- EState *estate;
- ExprContext *ecxt;
ListCell *lc;
- bool result = true;
- Oid relid = RelationGetRelid(relation);
List *rfnodes = NIL;
int n_filters;
@@ -857,16 +961,34 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
/*
- * Create a tuple table slot for row filter. TupleDesc must live as
- * long as the cache remains.
+ * Create tuple table slots for row filter. Create a copy of the
+ * TupleDesc as it needs to live as long as the cache remains.
*/
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
MemoryContextSwitchTo(oldctx);
}
entry->rowfilter_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple,
+                    TupleTableSlot *slot, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
/* Bail out if there is no row filter */
if (!entry->exprstate)
@@ -885,7 +1007,12 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
ecxt = GetPerTupleExprContext(estate);
ecxt->ecxt_scantuple = entry->scantuple;
- ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+ if (newtuple || oldtuple)
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+ else
+ {
+ ecxt->ecxt_scantuple = slot;
+ }
/*
* NOTE: Multiple publication row-filters have already been combined to a
@@ -898,7 +1025,6 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
}
/* Cleanup allocated resources */
- ResetExprContext(ecxt);
FreeExecutorState(estate);
PopActiveSnapshot();
@@ -956,6 +1082,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -964,7 +1093,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(relation, NULL, tuple, NULL, relentry))
break;
/*
@@ -995,9 +1124,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
- /* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter_update_check(relation, oldtuple, newtuple, relentry,
+ &modified_action))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1020,8 +1150,27 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ logicalrep_write_update(ctx->out, xid, relation,
+ oldtuple, newtuple, relentry->old_tuple,
+ relentry->new_tuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1031,7 +1180,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1449,6 +1598,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
+ entry->new_tuple = NULL;
+ entry->old_tuple = NULL;
+ entry->tmp_new_tuple = NULL;
entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..427c40a 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -211,7 +212,11 @@ extern void logicalrep_write_insert(StringInfo out, TransactionId xid,
extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup);
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
- HeapTuple newtuple, bool binary);
+ HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index de6b73d..a2f25f6 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -277,7 +277,8 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+#   but the row is later deleted because the UPDATE below changes it to a non-matching row.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -289,7 +290,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 575969c..e8dc5ad 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2198,6 +2198,7 @@ ReorderBufferApplyChangeCB
ReorderBufferApplyTruncateCB
ReorderBufferBeginCB
ReorderBufferChange
+ReorderBufferChangeType
ReorderBufferCommitCB
ReorderBufferCommitPreparedCB
ReorderBufferDiskChange
--
1.8.3.1
Attachment: v44-0004-Tab-auto-complete-and-pgdump-support-for-Row-Fil.patch (application/octet-stream)
From c0b7ed0a7c69165048bba05765b7ac67a41b24bd Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 1 Dec 2021 19:37:06 +1100
Subject: [PATCH v44] Tab auto-complete and pgdump support for Row Filter.
tab-auto-complete
-----------------
e.g.
"CREATE PUBLICATION <name> FOR TABLE <name>" - complete with "WHERE (".
"ALTER PUBLICATION <name> ADD|SET TABLE <name>" - complete with "WHERE (".
Author: Peter Smith
pg_dump
-------
Author: Euler Taveira
---
src/bin/pg_dump/pg_dump.c | 24 ++++++++++++++++++++----
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/tab-complete.c | 15 +++++++++++++--
3 files changed, 34 insertions(+), 6 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 5a2094d..3696ad2 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4264,6 +4264,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4274,9 +4275,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4285,6 +4293,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4325,6 +4334,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationObject(&(pubrinfo[j].dobj), fout);
@@ -4395,8 +4408,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index d1d8608..0842a3c 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -630,6 +630,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 630026d..6b87365 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,14 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (");
+ /* "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with table attributes */
+ /* "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2777,11 +2785,14 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("TABLES", "TABLES IN SCHEMA");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
- else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
+ /* "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
/*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
--
1.8.3.1
Attachment: v44-0002-PS-Row-filter-validation-walker.patch (application/octet-stream)
From 45386efd4c73b38657b13cd35f93f4fe63570ed9 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 1 Dec 2021 19:08:17 +1100
Subject: [PATCH v44] PS - Row filter validation walker
This patch implements a parse-tree "walker" to validate a row-filter expression.
REPLICA IDENTITY validation
---------------------------
For publish mode "delete" and "update" it validates that any columns referenced
in the filter expression must be part of REPLICA IDENTITY or Primary Key.
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
Expression Node-kind validation
-------------------------------
Only simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no user-defined functions.
- no user-defined types.
- no system functions (unless they are IMMUTABLE). See design decision at [1].
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr, NullIfExpr, NullTest
[1] https://www.postgresql.org/message-id/CAA4eK1%2BXoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ%40mail.gmail.com
---
src/backend/catalog/pg_publication.c | 198 +++++++++++++++++++++++++++++-
src/backend/parser/parse_agg.c | 14 ++-
src/backend/parser/parse_expr.c | 22 ++--
src/backend/parser/parse_func.c | 6 +-
src/backend/parser/parse_oper.c | 7 --
src/test/regress/expected/publication.out | 144 +++++++++++++++++++---
src/test/regress/sql/publication.sql | 106 +++++++++++++++-
src/test/subscription/t/027_row_filter.pl | 7 +-
src/tools/pgindent/typedefs.list | 1 +
9 files changed, 448 insertions(+), 57 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 89d00cd..d67023a 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,9 +33,11 @@
#include "catalog/pg_publication_namespace.h"
#include "catalog/pg_publication_rel.h"
#include "catalog/pg_type.h"
+#include "catalog/pg_proc.h"
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parse_clause.h"
#include "parser/parse_collate.h"
#include "parser/parse_relation.h"
@@ -219,10 +221,199 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
+/* For rowfilter_walker. */
+typedef struct
+{
+ Relation rel;
+ bool check_replident; /* check if Var is bms_replident member? */
+ Bitmapset *bms_replident;
+} rf_context;
+
/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
+ * The row filter walker checks that the row filter expression is legal.
+ *
+ * Rules: Node-type validation
+ * ---------------------------
+ * Allow only simple or compound expressions such as:
+ * - "(Var Op Const)" or
+ * - "(Var Op Var)" or
+ * - "(Var Op Const) Bool (Var Op Const)"
+ * - etc
+ * (where Var is a column of the table this filter belongs to)
+ *
+ * Specifically,
+ * - User-defined operators are not allowed.
+ * - User-defined functions are not allowed.
+ * - User-defined types are not allowed.
+ * - System functions that are not IMMUTABLE are not allowed.
+ * - NULLIF is allowed.
+ * - IS NULL is allowed.
+ *
+ * Notes:
+ *
+ * We don't allow user-defined functions/operators/types because (a) if the
+ * user drops such an object, or if its function raises any other error, the
+ * walsender won't be able to recover even after the function's problem is
+ * fixed, because a historic snapshot is used to access the row filter;
+ * (b) a function could access any other table, which won't work because of
+ * the historic snapshots used in the logical decoding environment.
+ *
+ * We don't allow anything other than immutable built-in functions because those
+ * (not immutable ones) can access database and would lead to the problem (b)
+ * mentioned in the previous paragraph.
+ *
+ * Rules: Replica Identity validation
+ * -----------------------------------
+ * If the flag context.check_replident is true then validate that every variable
+ * referenced by the filter expression is a valid member of the allowed set of
+ * replica identity columns (context.bms_replident).
*/
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+ bool too_complex = false;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+
+ /* User-defined types not allowed. */
+ if (var->vartype >= FirstNormalObjectId)
+ forbidden = _("user-defined types are not allowed");
+
+ /* Optionally, do replica identity validation of the referenced column. */
+ if (context->check_replident)
+ {
+ Oid relid = RelationGetRelid(context->rel);
+ AttrNumber attnum = var->varattno;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, context->bms_replident))
+ {
+ const char *colname = get_attname(relid, attnum, false);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ RelationGetRelationName(context->rel)),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+ }
+ }
+ else if (IsA(node, Const) || IsA(node, BoolExpr) || IsA(node, NullIfExpr)
+ || IsA(node, NullTest))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ Oid funcid = ((FuncExpr *)node)->funcid;
+ char *funcname = get_func_name(funcid);
+
+ /*
+ * User-defined functions are not allowed.
+ * System-functions that are not IMMUTABLE are not allowed.
+ */
+ if (funcid >= FirstNormalObjectId)
+ {
+ forbidden = psprintf("user-defined functions are not allowed: %s",
+ funcname);
+ }
+ else
+ {
+ if (func_volatile(funcid) != PROVOLATILE_IMMUTABLE)
+ forbidden = psprintf("system functions that are not IMMUTABLE are not allowed: %s",
+ funcname);
+ }
+ }
+ else
+ {
+ elog(DEBUG1, "the row filter contained something unexpected: %s", nodeToString(node));
+ too_complex = true;
+ }
+
+ if (too_complex)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(context->rel)),
+ errhint("only simple expressions using columns, constants and immutable system functions are allowed")
+ ));
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(context->rel)),
+ errdetail("%s", forbidden)
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
+/*
+ * Check if the row-filter is valid according to the following rules:
+ *
+ * 1. Only certain simple node types are permitted in the expression. See
+ * function rowfilter_walker for details.
+ *
+ * 2. If the publish operation contains "delete" or "update" then only columns
+ * that are allowed by the REPLICA IDENTITY rules are permitted to be used in
+ * the row-filter WHERE clause.
+ */
+static void
+rowfilter_expr_checker(Publication *pub, Relation rel, Node *rfnode)
+{
+ rf_context context = {0};
+
+ context.rel = rel;
+
+ /*
+ * For "delete" or "update", check that filter cols are also valid replica
+ * identity cols.
+ */
+ if (pub->pubactions.pubdelete || pub->pubactions.pubupdate)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ context.check_replident = true;
+
+ /*
+ * Find what are the cols that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+ }
+ }
+
+ /*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if anything not permitted or unexpected is encountered.
+ */
+ rowfilter_walker(rfnode, &context);
+
+ bms_free(context.bms_replident);
+}
+
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -333,6 +524,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
whereclause = GetTransformedWhereClause(pstate, pri, true);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, targetrel, whereclause);
}
/* Form a tuple. */
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..f65a86f 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,10 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
- if (isAgg)
- err = _("aggregate functions are not allowed in publication WHERE expressions");
- else
- err = _("grouping operations are not allowed in publication WHERE expressions");
-
+ /*
+ * OK for now. The row-filter validation is done later by a walker
+ * function (see pg_publication).
+ */
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +950,10 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
- err = _("window functions are not allowed in publication WHERE expressions");
+ /*
+ * OK for now. The row-filter validation is done later by a walker
+ * function (see pg_publication).
+ */
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..d8627b9 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,19 +200,8 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- {
- /*
- * Forbid functions in publication WHERE condition
- */
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("functions are not allowed in publication WHERE expressions"),
- parser_errposition(pstate, exprLocation(expr))));
-
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
- }
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -1777,7 +1766,10 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
- err = _("cannot use subquery in publication WHERE expression");
+ /*
+ * OK for now. The row-filter validation is done later by a walker
+ * function (see pg_publication).
+ */
break;
/*
@@ -3100,7 +3092,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 29bebb7..4e4557f 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,11 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
- err = _("set-returning functions are not allowed in publication WHERE expressions");
+ /*
+ * OK for now. The row-filter validation is done later by a walker
+ * function (see pg_publication).
+ */
+ pstate->p_hasTargetSRFs = true;
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..bc34a23 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,13 +718,6 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
- /* Check it's not a custom operator for publication WHERE expressions */
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("user-defined operators are not allowed in publication WHERE expressions"),
- parser_errposition(pstate, location)));
-
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 6959675..d9ee9ff 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -248,13 +248,15 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -264,7 +266,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -275,7 +277,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -286,7 +288,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
@@ -310,26 +312,26 @@ Publications:
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax1
Publication testpub_syntax1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE ((e < 999))
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax2
Publication testpub_syntax2
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
@@ -353,19 +355,41 @@ ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: user-defined functions are not allowed: testpub_rf_func99
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: system functions that are not IMMUTABLE are not allowed: random
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
+-- ok - IS NULL is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+-- fail - user-defined types disallowed
+CREATE TYPE rf_bug_status AS ENUM ('new', 'open', 'closed');
+CREATE TABLE rf_bug (id serial, description text, status rf_bug_status);
+CREATE PUBLICATION testpub6 FOR TABLE rf_bug WHERE (status = 'open') WITH (publish="insert");
+ERROR: invalid publication WHERE expression for relation "rf_bug"
+DETAIL: user-defined types are not allowed
+DROP TABLE rf_bug;
+DROP TYPE rf_bug_status;
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+HINT: only simple expressions using columns, constants and immutable system functions are allowed
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
@@ -387,6 +411,92 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 40198fc..fcc09b1 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -143,7 +143,9 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -163,12 +165,12 @@ RESET client_min_messages;
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax1
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax2
DROP PUBLICATION testpub_syntax2;
@@ -182,13 +184,31 @@ SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
+-- ok - IS NULL is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - user-defined types disallowed
+CREATE TYPE rf_bug_status AS ENUM ('new', 'open', 'closed');
+CREATE TABLE rf_bug (id serial, description text, status rf_bug_status);
+CREATE PUBLICATION testpub6 FOR TABLE rf_bug WHERE (status = 'open') WITH (publish="insert");
+DROP TABLE rf_bug;
+DROP TYPE rf_bug_status;
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
SET client_min_messages = 'ERROR';
@@ -208,6 +228,82 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
+
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index 64e71d0..de6b73d 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -280,9 +282,7 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -291,7 +291,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index f41ef0d..575969c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3501,6 +3501,7 @@ replace_rte_variables_context
ret_type
rewind_source
rewrite_event
+rf_context
rijndael_ctx
rm_detail_t
role_auth_extra
--
1.8.3.1
On Thu, Sep 23, 2021 at 10:33 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
2) create_publication.sgml says:
A <literal>NULL</literal> value causes the expression to evaluate
to false; avoid using columns without not-null constraints in the
<literal>WHERE</literal> clause.

That's not quite correct, I think - doesn't the expression evaluate to
NULL (which is not TRUE, so it counts as mismatch)?

I suspect this whole paragraph (talking about NULL in old/new rows)
might be a bit too detailed / low-level for user docs.

Updated docs in v44 [1]
------
[1]: /messages/by-id/CAHut+PtjxzedJPbSZyb9pd72+UrGEj6HagQQbCdO0YJvr7OyJg@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Nov 23, 2021 at 5:27 PM vignesh C <vignesh21@gmail.com> wrote:
2) Since the error message is because it publishes delete/update
operations, it should include publish delete/update in the error
message. Can we change the error message:

+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, context->bms_replident))
+ {
+     const char *colname = get_attname(relid, attnum, false);
+
+     ereport(ERROR,
+             (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+              errmsg("cannot add relation \"%s\" to publication",
+                     RelationGetRelationName(context->rel)),
+              errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+                        colname)));
+ }

To something like:
ereport(ERROR,
(errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
errmsg("cannot add relation \"%s\" to publication because row filter
column \"%s\" does not have a replica identity and publishes
deletes/updates",
RelationGetRelationName(context->rel), colname),
errhint("To enable deleting/updating from the table, set REPLICA
IDENTITY using ALTER TABLE")));
The "top-up" patch 0005 (see v43*) is already addressing this now.
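For context, the remedy that the suggested errhint points at can be sketched as follows (a hypothetical example, not part of the patch; the table name is borrowed from the thread's earlier example):

```sql
-- Hypothetical illustration of the suggested errhint: widen the replica
-- identity so that every column (including any row-filter column) is
-- present in the old tuple for UPDATE/DELETE replication.
ALTER TABLE departments REPLICA IDENTITY FULL;
```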
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Nov 23, 2021 at 6:59 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Tues, Nov 23, 2021 2:27 PM vignesh C <vignesh21@gmail.com> wrote:
On Thu, Nov 18, 2021 at 7:04 AM Peter Smith <smithpb2250@gmail.com>
wrote:

PSA new set of v40* patches.
Few comments:
...
Another comment about v40-0001 patch:
+ char *relname = pstrdup(RelationGetRelationName(rel));
+ table_close(rel, ShareUpdateExclusiveLock);
+
+ /* Disallow duplicate tables if there are any with row-filters. */
+ if (t->whereClause || list_member_oid(relids_with_rf, myrelid))
+     ereport(ERROR,
+             (errcode(ERRCODE_DUPLICATE_OBJECT),
+              errmsg("conflicting or redundant row-filters for \"%s\"",
+                     relname)));
+ pfree(relname);

Maybe we can do the error check before table_close(), so that we don't need to
invoke pstrdup() and pfree().
Fixed in v44 [1]
------
[1]: /messages/by-id/CAHut+PtjxzedJPbSZyb9pd72+UrGEj6HagQQbCdO0YJvr7OyJg@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Wed, Nov 24, 2021 at 3:37 AM vignesh C <vignesh21@gmail.com> wrote:
On Tue, Nov 23, 2021 at 4:58 PM Ajin Cherian <itsajin@gmail.com> wrote:
Attaching a new patchset v41 which includes changes by both Peter and myself.
Few comments on v41-0002 patch:
...
2) Tab completion completes with "WHERE (" in case of "alter publication
pub1 add table t1,":

+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny))
+     COMPLETE_WITH("WHERE (");

Should this be changed to:

+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny) &&
+          !ends_with(prev_wd, ','))
+     COMPLETE_WITH("WHERE (");
Fixed in v44 [1]
------
[1]: /messages/by-id/CAHut+PtjxzedJPbSZyb9pd72+UrGEj6HagQQbCdO0YJvr7OyJg@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Thu, Nov 25, 2021 at 12:03 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Nov 23, 2021 at 4:58 PM Ajin Cherian <itsajin@gmail.com> wrote:
Attaching a new patchset v41 which includes changes by both Peter and myself.
Patches v40-0005 and v40-0006 have been merged to create patch
v41-0005 which reduces the patches to 6 again.
This patch-set contains changes addressing the following review comments:

On Mon, Nov 15, 2021 at 5:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
What I meant was that with this new code we have regressed the old
behavior. Basically, imagine a case where no filter was given for any
of the tables. Then after the patch, we will remove all the old tables
whereas before the patch it will remove the oldrels only when they are
not specified as part of new rels. If you agree with this, then we can
retain the old behavior and for the new tables, we can always override
the where clause for a SET variant of command.

Fixed and modified the behaviour to match with what the schema patch
implemented.

+
+ /*
+  * If the new relation or the old relation has a where clause,
+  * we need to remove it so that it can be added afresh later.
+  */
+ if (RelationGetRelid(newpubrel->relation) == oldrelid &&
+     newpubrel->whereClause == NULL && rfisnull)

Can't we use _equalPublicationTable() here? It compares the whereClause as well.

Fixed in v44 [1]
Few more comments:
=================
0001
...
3. In the function header comment of rowfilter_walker, you mentioned
the simple expressions allowed but we should write why we are doing
so. It has been discussed in detail in various emails in this thread.
AFAIR, below are the reasons:
A. We don't want to allow user-defined functions or operators because
(a) if the user drops such a function/operator or if there is any
other error via that function, the walsender won't be able to recover
from such an error even if we fix the function's problem because it
uses a historic snapshot to access row-filter; (b) any other table
could be accessed via a function which won't work because of historic
snapshots in logical decoding environment.

B. We don't allow anything other than immutable built-in functions as those
can access database and would lead to the problem (b) mentioned in the
previous paragraph.
Updated comment in v44 [1]
Don't we need to check for user-defined types, similar to user-defined
functions and operators? If not, why?
Fixed in v44 [1]
------
[1]: /messages/by-id/CAHut+PtjxzedJPbSZyb9pd72+UrGEj6HagQQbCdO0YJvr7OyJg@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Nov 30, 2021 at 2:49 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Thursday, November 25, 2021 11:22 AM Peter Smith <smithpb2250@gmail.com> wrote:
Thanks for all the review comments so far! We are endeavouring to keep
pace with them.

All feedback is being tracked and we will fix and/or reply to everything ASAP.
Meanwhile, PSA the latest set of v42* patches.
This version was mostly a patch restructuring exercise but it also
addresses some minor review comments in passing.

Thanks for your patch.
I have two comments on the document in the 0001 patch.

1.
+ New row is used and it contains all columns. A <literal>NULL</literal> value
+ causes the expression to evaluate to false; avoid using columns without

I don't quite understand this sentence 'A NULL value causes the expression
to evaluate to false'. An expression containing a NULL value can also
return true. Could you be more specific?

For example:
postgres=# select null or true;
?column?
----------
t
(1 row)
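The point here is SQL three-valued logic: an expression containing NULL is not automatically false overall (OR can short-circuit it to true), but a bare comparison against NULL yields NULL, and the row filter only replicates rows for which the expression is strictly true. A sketch:

```sql
SELECT NULL OR TRUE;        -- t: OR is true whenever either side is true
SELECT NULL <> 'filtered';  -- NULL (unknown); not strictly true, so a row
                            -- whose filter evaluates like this is skipped
```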
Updated publication docs in v44 [1].
2.
+ at all then all other filters become redundant. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row filtering
+ is ignored.

If the subscriber is a PostgreSQL version before 15, it seems row filtering
will be ignored only when copying initial data; the later changes will not
be ignored in row filtering. Should we make this clear in the document?
Updated subscription docs in v44 [1].
------
[1]: /messages/by-id/CAHut+PtjxzedJPbSZyb9pd72+UrGEj6HagQQbCdO0YJvr7OyJg@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Nov 30, 2021 at 9:34 PM vignesh C <vignesh21@gmail.com> wrote:
...
Thanks for the updated patch, a few comments:

1) Should this be changed to say that non-IMMUTABLE system functions are
not allowed?

+ not-null constraints in the <literal>WHERE</literal> clause. The
+ <literal>WHERE</literal> clause does not allow functions or user-defined
+ operators.
+ </para>
Updated docs in v44 [1]
2) We can remove the #if 0 code if we don't plan to keep it in the final patch.

--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,12 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
+#if 0
if (isAgg)
err = _("aggregate functions are not
allowed in publication WHERE expressions");
else
err = _("grouping operations are not
allowed in publication WHERE expressions");
-
+#endif
Fixed in v44 [1]
4) Should this be changed, since we error out if a publisher without replica
identity performs delete or update:

+ The <literal>WHERE</literal> clause must contain only columns that are
+ covered by <literal>REPLICA IDENTITY</literal>, or are part of the primary
+ key (when <literal>REPLICA IDENTITY</literal> is not set), otherwise
+ <command>DELETE</command> or <command>UPDATE</command> operations will not
+ be replicated. That's because old row is used and it only contains primary
+ key or columns that are part of the <literal>REPLICA IDENTITY</literal>; the
+ remaining columns are <literal>NULL</literal>. For <command>INSERT</command>

to:

+ The <literal>WHERE</literal> clause must contain only columns that are
+ covered by <literal>REPLICA IDENTITY</literal>, or are part of the primary
+ key (when <literal>REPLICA IDENTITY</literal> is not set), otherwise
+ <command>DELETE</command> or <command>UPDATE</command> operations will be
+ disallowed on those tables. That's because old row is used and it only
+ contains primary key or columns that are part of the
+ <literal>REPLICA IDENTITY</literal>; the remaining columns are
+ <literal>NULL</literal>. For <command>INSERT</command>
Updated docs in v44 [1]
------
[1]: /messages/by-id/CAHut+PtjxzedJPbSZyb9pd72+UrGEj6HagQQbCdO0YJvr7OyJg@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Thursday, December 2, 2021 5:21 AM Peter Smith <smithpb2250@gmail.com> wrote:
PSA the v44* set of patches.
Thanks for the new patch. A few comments:

1. This is an example in the publication doc, but in fact it's not allowed.
Should we change this example?
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
postgres=# CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
ERROR: invalid publication WHERE expression for relation "departments"
HINT: only simple expressions using columns, constants and immutable system functions are allowed
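Assuming the rule stated in the HINT, a variant of the doc example restricted to a simple column/constant comparison should pass validation (an untested sketch, not taken from the patch):

```sql
-- Hypothetical rewrite of the doc example using only a plain
-- column = constant comparison, per the HINT above.
CREATE PUBLICATION active_departments FOR TABLE departments
    WHERE (active = true);
```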
2. A typo in 0002 patch.
+ * drops such a user-defnition or if there is any other error via its function,
"user-defnition" should be "user-definition".
Regards,
Tang
On Thur, Dec 2, 2021 5:21 AM Peter Smith <smithpb2250@gmail.com> wrote:
PSA the v44* set of patches.
The following review comments are addressed:
v44-0001 main patch
- Renamed the TAP test 026->027 due to clash caused by recent commit [1]
- Refactored table_close [Houz 23/11] #2
- Alter compare where clauses [Amit 24/11] #0
- PG docs CREATE SUBSCRIPTION [Tang 30/11] #2
- PG docs CREATE PUBLICATION [Vignesh 30/11] #1, #4, [Tang 30/11] #1, [Tomas 23/9] #2

v44-0002 validation walker
- Add NullTest support [Peter 18/11]
- Update comments [Amit 24/11] #3
- Disallow user-defined types [Amit 24/11] #4
- Errmsg - skipped because handled by top-up [Vignesh 23/11] #2
- Removed #if 0 [Vignesh 30/11] #2

v44-0003 new/old tuple
- NA

v44-0004 tab-complete and pgdump
- Handle table-list commas better [Vignesh 23/11] #2

v44-0005 top-up patch for validation
- (This patch will be added again later)
Attached is the v44-0005 top-up patch.
This version addresses all the comments received so far,
mainly including the following changes:
1) Rename rfcol_valid_for_replica to rfcol_valid.
2) Remove the struct PublicationInfo and add the rfcol_valid flag directly in the relation.
3) Report the invalid column number in the error message.
4) Rename some functions to match their usage.
5) Fix some typos and add some code comments.
6) Fix an omission in a testcase.
Best regards,
Hou zj
Attachments:
v44-0004-Tab-auto-complete-and-pgdump-support-for-Row-Fil.patchapplication/octet-stream; name=v44-0004-Tab-auto-complete-and-pgdump-support-for-Row-Fil.patchDownload
From c0b7ed0a7c69165048bba05765b7ac67a41b24bd Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 1 Dec 2021 19:37:06 +1100
Subject: [PATCH v44] Tab auto-complete and pgdump support for Row Filter.
tab-auto-complete
-----------------
e.g.
"CREATE PUBLICATION <name> FOR TABLE <name>" - complete with "WHERE (".
"ALTER PUBLICATION <name> ADD|SET TABLE <name>" - complete with "WHERE (".
Author: Peter Smith
pg_dump
-------
Author: Euler Taveira
---
src/bin/pg_dump/pg_dump.c | 24 ++++++++++++++++++++----
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/tab-complete.c | 15 +++++++++++++--
3 files changed, 34 insertions(+), 6 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 5a2094d..3696ad2 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4264,6 +4264,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4274,9 +4275,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4285,6 +4293,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4325,6 +4334,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationObject(&(pubrinfo[j].dobj), fout);
@@ -4395,8 +4408,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index d1d8608..0842a3c 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -630,6 +630,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 630026d..6b87365 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,14 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (");
+ /* "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with table attributes */
+ /* "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2777,11 +2785,14 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("TABLES", "TABLES IN SCHEMA");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
- else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
+ /* "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
/*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
--
1.8.3.1
v44-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v44-0001-Row-filter-for-logical-replication.patchDownload
From 2d4971ec6f9d37c13484fdd2eee3055bef4a69c1 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 1 Dec 2021 18:50:01 +1100
Subject: [PATCH v44] Row filter for logical replication.
This feature adds row filtering for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The row-filter WHERE clause for a table added to a publication that publishes
UPDATE and/or DELETE operations must contain only columns that are covered by
REPLICA IDENTITY. The row-filter WHERE clause for a table added to a publication
that publishes INSERT can use any column. The WHERE clause does not allow
user-defined functions / operators / types; it also does not allow built-in
functions unless they are immutable.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is pulled by the subscriber. If the subscription has several
publications in which a table has been published with different WHERE clauses,
rows which satisfy ANY of the expressions will be copied. If the subscriber is a
pre-15 version, data synchronization won't use row filters even if they are
defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
Psql commands \dRp+ and \d+ will display any row filters.
Author: Euler Taveira, Peter Smith
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
Combining multiple row-filters
==============================
The subscription is treated "as a union of all the publications" [1], so the
row-filters are combined with OR.
If the subscription has several publications in which the same table has been
published with different filters, those expressions get OR'ed together so that
rows satisfying any of the expressions will be replicated.
Notice this means if one of the publications has no filter at all then all other
filters become redundant.
Author: Peter Smith
[1] https://www.postgresql.org/message-id/574b4e78-2f35-acf3-4bdc-4b872582e739%40enterprisedb.com
Row-filter caching
==================
The cached row-filters (e.g. ExprState *) are invalidated only in function
rel_sync_cache_relation_cb, so it means the ALTER PUBLICATION for one table
will not cause row-filters of other tables to also become invalidated.
The code related to caching row-filters is done just before they are needed
(in the pgoutput_row_filter function).
If there are multiple publication filters for a given table these are all
combined/flattened into a single filter.
Author: Peter Smith, Greg Nancarrow
The filter caching is based on suggestions from Amit [1] [2], and Houz [3]
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
[3] https://www.postgresql.org/message-id/OS0PR01MB5716090A70A73ADF58C58950948D9%40OS0PR01MB5716.jpnprd01.prod.outlook.com
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 13 +-
doc/src/sgml/ref/create_publication.sgml | 28 ++-
doc/src/sgml/ref/create_subscription.sgml | 17 ++
src/backend/catalog/pg_publication.c | 62 ++++-
src/backend/commands/publicationcmds.c | 105 ++++++--
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 38 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 116 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 344 ++++++++++++++++++++++++++-
src/bin/psql/describe.c | 27 ++-
src/include/catalog/pg_publication.h | 7 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 148 ++++++++++++
src/test/regress/sql/publication.sql | 75 ++++++
src/test/subscription/t/027_row_filter.pl | 357 ++++++++++++++++++++++++++++
23 files changed, 1347 insertions(+), 49 deletions(-)
mode change 100644 => 100755 src/backend/parser/gram.y
create mode 100644 src/test/subscription/t/027_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1d11be..af6b1f6 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6299,6 +6299,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index bb4ef5e..01247d7 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -30,7 +30,7 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -52,7 +52,9 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
remove one or more tables/schemas from the publication. Note that adding
tables/schemas to a publication that is already subscribed to will require a
<literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on the
- subscribing side in order to become effective.
+ subscribing side in order to become effective. Note also that the combination
+ of row-filter <literal>WHERE</literal> for <literal>DROP</literal> clause is
+ not allowed.
</para>
<para>
@@ -109,7 +111,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 4aeb0c8..f06dd92 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -28,7 +28,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -76,6 +76,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -226,6 +230,16 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The row-filter <literal>WHERE</literal> clause for a table added to a publication that
+ publishes <command>UPDATE</command> and/or <command>DELETE</command> operations must
+ contain only columns that are covered by <literal>REPLICA IDENTITY</literal>. The
+ row-filter <literal>WHERE</literal> clause for a table added to a publication that
+ publishes <command>INSERT</command> can use any column. The <literal>WHERE</literal>
+ clause does not allow user-defined functions / operators / types; it also does not allow
+ built-in functions unless they are immutable.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -240,6 +254,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -253,6 +272,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..8453467 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -208,6 +208,10 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
that are being subscribed to when the replication starts.
The default is <literal>true</literal>.
</para>
+ <para>
+ Row-filtering may also apply here and will affect what data is
+ copied. Refer to the Notes section below.
+ </para>
</listitem>
</varlistentry>
@@ -319,6 +323,19 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
the parameter <literal>create_slot = false</literal>. This is an
implementation restriction that might be lifted in a future release.
</para>
+
+ <para>
+ If any table in the publication has a <literal>WHERE</literal> clause, rows
+ that do not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be replicated. If the subscription has several publications in
+ which the same table has been published with different filters, those
+ expressions get OR'ed together so that rows satisfying any of the expressions
+ will be replicated. Notice this means if one of the publications has no filter
+ at all then all other filters become redundant. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row filtering
+ is ignored during data synchronization.
+ </para>
+
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 63579b2..89d00cd 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -253,22 +256,51 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
return result;
}
+Node *
+GetTransformedWhereClause(ParseState *pstate, PublicationRelInfo *pri,
+ bool bfixupcollation)
+{
+ ParseNamespaceItem *nsitem;
+ Node *transformedwhereclause = NULL;
+
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, pri->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ transformedwhereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION WHERE");
+
+ /* Fix up collation information */
+ if (bfixupcollation)
+ assign_expr_collations(pstate, transformedwhereclause);
+
+ return transformedwhereclause;
+}
+
/*
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -289,10 +321,19 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a ParseState to parse with */
+ pstate = make_parsestate(NULL);
+
+ /* Fix up collation information */
+ whereclause = GetTransformedWhereClause(pstate, pri, true);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -306,6 +347,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -322,6 +369,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 7d4a0e9..6373fa2 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -497,6 +497,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *rels = NIL;
Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
Oid pubid = pubform->oid;
+ Node *oldrelwhereclause = NULL;
/*
* It is quite possible that for the SET case user has not specified any
@@ -529,40 +530,92 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *delrels = NIL;
ListCell *oldlc;
+ /*
+ * Check if the relation is member of the existing schema in the
+ * publication or member of the schema list specified.
+ */
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
- /* Calculate which relations to drop. */
+ /*
+ * In order to recreate the relation list for the publication,
+ * look for existing relations that need not be dropped.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
+ ListCell *newlc;
+ PublicationRelInfo *oldrel;
bool found = false;
+ HeapTuple rftuple;
+ bool rfisnull = true;
+
+ /* look up the cache for the old relmap */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(oldrelid),
+ ObjectIdGetDatum(pubid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ Datum whereClauseDatum;
+
+ whereClauseDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual,
+ &rfisnull);
+ if (!rfisnull)
+ oldrelwhereclause = stringToNode(TextDatumGetCString(whereClauseDatum));
+
+ ReleaseSysCache(rftuple);
+ }
foreach(newlc, rels)
{
PublicationRelInfo *newpubrel;
newpubrel = (PublicationRelInfo *) lfirst(newlc);
+
+ /*
+ * Look if any of the new set of relations match with
+ * the existing relations in the publication. Additionally,
+ * if the relation has an associated where-clause, check the
+ * where-clauses also match. Drop the rest.
+ */
if (RelationGetRelid(newpubrel->relation) == oldrelid)
{
- found = true;
- break;
+ if (rfisnull && !newpubrel->whereClause)
+ {
+ found = true;
+ break;
+ }
+
+ if (!rfisnull && newpubrel->whereClause)
+ {
+ ParseState *pstate = make_parsestate(NULL);
+ Node *whereclause;
+
+ whereclause = GetTransformedWhereClause(pstate,
+ newpubrel,
+ false);
+ if (equal(oldrelwhereclause, whereclause))
+ {
+ free_parsestate(pstate);
+ found = true;
+ break;
+ }
+
+ free_parsestate(pstate);
+ }
}
}
- /* Not yet in the list, open it and add to the list */
+
+ /*
+ * Add the non-matched relations to a list so that they can
+ * be dropped.
+ */
if (!found)
{
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrelid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
}
@@ -899,6 +952,7 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ List *relids_with_rf = NIL;
/*
* Open, share-lock, and check all the explicitly-specified relations
@@ -926,15 +980,26 @@ OpenTableList(List *tables)
*/
if (list_member_oid(relids, myrelid))
{
+ /* Disallow duplicate tables if there are any with row-filters. */
+ if (t->whereClause || list_member_oid(relids_with_rf, myrelid))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("conflicting or redundant row-filters for \"%s\"",
+ RelationGetRelationName(rel))));
+
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
+ if (t->whereClause)
+ relids_with_rf = lappend_oid(relids_with_rf, myrelid);
+
/*
* Add children of this rel, if requested, so that they too are added
* to the publication. A partitioned table can't have any inheritance
@@ -967,6 +1032,8 @@ OpenTableList(List *tables)
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -974,6 +1041,7 @@ OpenTableList(List *tables)
}
list_free(relids);
+ list_free(relids_with_rf);
return rels;
}
@@ -993,6 +1061,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -1088,6 +1158,11 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
RelationGetRelationName(rel))));
}
+ if (pubrel->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE")));
+
ObjectAddressSet(obj, PublicationRelRelationId, prid);
performDeletion(&obj, DROP_CASCADE, 0);
}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 297b6ee..be9c1fb 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4832,6 +4832,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index f537d3e..5776447 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2312,6 +2312,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
old mode 100644
new mode 100755
index 86ce33b..8e96d54
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9654,12 +9654,13 @@ CreatePublicationStmt:
* relation_expr here.
*/
PublicationObjSpec:
- TABLE relation_expr
+ TABLE relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_TABLE;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $2;
+ $$->pubtable->whereClause = $3;
}
| ALL TABLES IN_P SCHEMA ColId
{
@@ -9674,28 +9675,45 @@ PublicationObjSpec:
$$->pubobjtype = PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA;
$$->location = @5;
}
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ /*
+ * The OptWhereClause (row-filter) must be stored here
+ * but it is valid only for tables. If the ColId was
+ * mistakenly not a table this will be detected later
+ * in preprocess_pubobj_list() and an error is thrown.
+ */
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
$$->location = @1;
}
- | ColId indirection
+ | ColId indirection OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
+ $$->pubtable->whereClause = $3;
$$->location = @1;
}
/* grammar like tablename * , ONLY tablename, ONLY ( tablename ) */
- | extended_relation_expr
+ | extended_relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $1;
+ $$->pubtable->whereClause = $2;
}
| CURRENT_SCHEMA
{
@@ -17343,7 +17361,8 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ if (pubobj->name)
{
/* convert it to PublicationTable */
PublicationTable *pubtable = makeNode(PublicationTable);
@@ -17356,6 +17375,13 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
else if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_SCHEMA ||
pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA)
{
+ /* Row filters are not allowed on schema objects. */
+ if (pubobj->pubtable && pubobj->pubtable->whereClause)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid to use WHERE (row-filter) for a schema"),
+ parser_errposition(pubobj->location));
+
/*
* We can distinguish between the different type of schema
* objects based on whether name and pubtable is set.
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 542f916..29bebb7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..af73b14 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,80 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get the relation qual. DISTINCT prevents the same expression, for a
+ * table that is in multiple publications, from being included multiple
+ * times in the final expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ /*
+ * Multiple row-filter expressions for the same table will later be
+ * combined by the COPY using OR, but this means if any of the filters is
+ * null, then effectively none of the other filters is meaningful. So this
+ * loop is also checking for null filters and can exit early if any are
+ * encountered.
+ */
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+
+ if (isnull)
+ {
+ /*
+ * A single null filter nullifies the effect of any other filter for this
+ * table.
+ */
+ if (*qual)
+ {
+ list_free(*qual);
+ *qual = NIL;
+ }
+ break;
+ }
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +887,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +896,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +907,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with OR'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +927,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of OR'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
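The tablesync logic above (fetch the per-publication quals, OR them into the COPY command, and treat any NULL filter as "no filter at all") can be sketched outside the server. This is a minimal Python sketch of the string-building only, ignoring the non-table and generated-column handling, and parenthesizing each filter for clarity (the patch relies on the stored expression text instead); all names are illustrative:

```python
def build_copy_query(nspname, relname, cols, quals):
    """Sketch of the COPY command the tablesync worker would send.

    quals is the list of row-filter expressions fetched from the publisher;
    None means some publication publishes the table with no filter, which
    nullifies every other filter (filters are OR'ed together).
    """
    if None in quals:
        quals = []  # a single missing filter makes the others meaningless

    target = f'"{nspname}"."{relname}"'
    if not quals:
        # no filter: plain table copy
        return f"COPY {target} TO STDOUT"

    # filtered copy: COPY (SELECT ... WHERE f1 OR f2 ...) TO STDOUT
    select = f"SELECT {', '.join(cols)} FROM {target}"
    where = " OR ".join(f"({q})" for q in quals)
    return f"COPY ({select} WHERE {where}) TO STDOUT"
```

For example, two filters produce `COPY (SELECT a, b FROM "public"."t" WHERE (a > 1) OR (b < 2)) TO STDOUT`, while a list containing None degrades to a plain `COPY "public"."t" TO STDOUT`.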
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6f6a203..3b85915 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -15,16 +15,24 @@
#include "access/tupconvert.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/makefuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -116,6 +124,17 @@ typedef struct RelationSyncEntry
PublicationActions pubactions;
/*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' indicates if the exprstate has been assigned
+ * yet or not. We cannot just use the exprstate value for this purpose
+ * because there might be no filter at all for the current relid (in which
+ * case exprstate is NULL).
+ */
+ bool rowfilter_valid;
+ ExprState *exprstate; /* ExprState for row filter(s) */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+
+ /*
* OID of the relation to publish changes as. For a partition, this may
* be set to one of its ancestors whose schema will be used when
* replicating changes, if publish_via_partition_root is set for the
@@ -137,7 +156,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +165,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +647,265 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+
+ /* Cache ExprState using CacheMemoryContext. */
+ Assert(CurrentMemoryContext == CacheMemoryContext);
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be cast to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * This is the same code as ExecPrepareExpr() but that is not used because
+ * we have no EState to pass it. There should probably be another function
+ * in the executor to handle the execution outside a normal Plan tree
+ * context.
+ */
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is treated as false, i.e. the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ List *rfnodes = NIL;
+ int n_filters;
+
+ /*
+ * If the row filter cache is currently flagged "invalid" then it means we
+ * don't know yet whether there are any row filters for this relation.
+ *
+ * This code is usually one-time execution.
+ *
+ * NOTE: The ExprState cache could have been created up-front in the
+ * function get_rel_sync_entry() instead of the deferred on-the-fly
+ * assignment below. The reason for choosing to do it here is because there
+ * are some scenarios where the get_rel_sync_entry() is called but where a
+ * row will not be published. For example, for truncate, we may not need
+ * any row evaluation, so there is no need to compute it. It would also be
+ * a waste if any error happens before actually evaluating the filter. And
+ * tomorrow there could be other operations (which use get_rel_sync_entry)
+ * but which don't need to build ExprState. Furthermore, because the
+ * decision to publish or not is made AFTER the call to get_rel_sync_entry
+ * it may be that the filter evaluation is not necessary at all. So the
+ * decision was to defer this logic to last moment when we know it will be
+ * needed.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ MemoryContext oldctx;
+
+ /* Release the tuple table slot if it already exists. */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState and cache it in entry->exprstate.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ *
+ * NOTE: Multiple row-filters for the same table are combined by OR-ing
+ * them together, but this means that if (in any of the publications)
+ * there is *no* filter then effectively none of the other filters have
+ * any meaning either.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+
+ /*
+ * Look up whether there is a row filter, and if so remember it in a list.
+ * After this 'publications' loop we will combine all the filters.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(entry->publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes = lappend(rfnodes, rfnode);
+ MemoryContextSwitchTo(oldctx);
+
+ ReleaseSysCache(rftuple);
+ }
+ else
+ {
+ /*
+ * If there is no row-filter, then any other row-filters for this table
+ * also have no effect (because filters get OR-ed together) so we can
+ * just discard anything found so far and exit early from the publications
+ * loop.
+ */
+ if (rfnodes)
+ {
+ list_free_deep(rfnodes);
+ rfnodes = NIL;
+ }
+ ReleaseSysCache(rftuple);
+ break;
+ }
+
+ }
+
+ } /* loop all subscribed publications */
+
+ /*
+ * Combine all the row filters (if any) into a single filter, and then build the ExprState for it.
+ */
+ n_filters = list_length(rfnodes);
+ if (n_filters > 0)
+ {
+ Node *rfnode;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes, -1) : linitial(rfnodes);
+ entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains.
+ */
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ entry->rowfilter_valid = true;
+ }
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ if (message_level_is_interesting(DEBUG3))
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * NOTE: Multiple publication row-filters have already been combined to a
+ * single exprstate.
+ */
+ if (entry->exprstate)
+ {
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(entry->exprstate, ecxt);
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +932,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +956,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +963,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * Schema should be sent before the logic that replaces the
+ * relation because it also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +996,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +1030,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +1099,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1116,9 +1421,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1139,8 +1445,11 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1245,9 +1554,6 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
}
list_free(pubids);
@@ -1354,6 +1660,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanups. (Will be rebuilt later if needed).
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstate != NULL)
+ {
+ pfree(entry->exprstate);
+ entry->exprstate = NULL;
+ }
}
}
@@ -1365,6 +1686,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1696,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1392,6 +1716,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
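The publish-or-skip decision implemented by pgoutput_row_filter above has simple semantics: filters from all publications are OR'ed, a filter evaluating to SQL NULL counts as false, and no filter at all means publish everything. A conceptual Python sketch (filters modeled as callables that may return True, False, or None for NULL; names are illustrative, not from the patch):

```python
def row_filter_matches(filters, row):
    """Return True if the change should be replicated.

    filters: list of callables taking a row dict and returning
             True / False / None (None stands in for SQL NULL).
    An empty list means no row filter, so everything is published.
    OR semantics: the row passes if any filter is definitely True;
    NULL results never make a row pass (NULL is treated as false).
    """
    if not filters:
        return True
    return any(f(row) is True for f in filters)
```

This mirrors SQL three-valued logic for the combined OR expression: `TRUE OR NULL` is true (published), while `FALSE OR NULL` is NULL, which the output plugin treats as "do not replicate".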
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index ea721d9..8be5643 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3150,17 +3150,22 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
" JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
"WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
"UNION\n"
"SELECT pubname\n"
+ " , pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+ " , pg_catalog.pg_class c\n"
"WHERE pr.prrelid = '%s'\n"
+ " AND c.oid = pr.prrelid\n"
"UNION\n"
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
"WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
"ORDER BY 1;",
@@ -3196,6 +3201,13 @@ describeOneTableDetails(const char *schemaname,
printfPQExpBuffer(&buf, " \"%s\"",
PQgetvalue(result, i, 0));
+ if (pset.sversion >= 150000)
+ {
+ /* Also display the publication row-filter (if any) for this table */
+ if (!PQgetisnull(result, i, 1))
+ appendPQExpBuffer(&buf, " WHERE (%s)", PQgetvalue(result, i, 1));
+ }
+
printTableAddFooter(&cont, buf.data);
}
PQclear(result);
@@ -6320,8 +6332,12 @@ addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
for (i = 0; i < count; i++)
{
if (!singlecol)
+ {
printfPQExpBuffer(buf, " \"%s.%s\"", PQgetvalue(res, i, 0),
PQgetvalue(res, i, 1));
+ if (!PQgetisnull(res, i, 2))
+ appendPQExpBuffer(buf, " WHERE (%s)", PQgetvalue(res, i, 2));
+ }
else
printfPQExpBuffer(buf, " \"%s\"", PQgetvalue(res, i, 0));
@@ -6450,8 +6466,15 @@ describePublications(const char *pattern)
{
/* Get the tables for the specified publication */
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 1ae439e..4a25222 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -20,6 +20,7 @@
#include "catalog/genbki.h"
#include "catalog/objectaddress.h"
#include "catalog/pg_publication_d.h"
+#include "parser/parse_node.h"
/* ----------------
* pg_publication definition. cpp turns this into
@@ -86,6 +87,7 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -122,13 +124,16 @@ extern List *GetPubPartitionOptionRelations(List *result,
Oid relid);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
extern char *get_publication_name(Oid pubid, bool missing_ok);
+extern Node *GetTransformedWhereClause(ParseState *pstate,
+ PublicationRelInfo *pri,
+ bool bfixupcollation);
#endif /* PG_PUBLICATION_H */
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 067138e..5d58a9f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3641,6 +3641,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 1feb558..6959675 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -239,6 +239,154 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+ Table "public.testpub_rf_tbl1"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+----------+--------------+-------------
+ a | integer | | | | plain | |
+ b | text | | | | extended | |
+Publications:
+ "testpub5a" WHERE ((a > 1))
+ "testpub5b"
+ "testpub5c" WHERE ((a > 3))
+
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+ Publication testpub_syntax1
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE ((e < 999))
+
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+ Publication testpub_syntax2
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
+
+DROP PUBLICATION testpub_syntax2;
+-- fail - WHERE row-filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a =...
+ ^
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+ERROR: invalid to use WHERE (row-filter) for a schema
+LINE 1: ...ax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf...
+ ^
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+ERROR: cannot add relation "testpub_rf_myschema1.testpub_rf_tb16" to publication
+DETAIL: Table's schema "testpub_rf_myschema1" is already part of the publication or part of the specified schema list.
+RESET client_min_messages;
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 8fa0435..40198fc 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -134,6 +134,81 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;
+-- fail - WHERE row-filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+RESET client_min_messages;
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
new file mode 100644
index 0000000..64e71d0
--- /dev/null
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -0,0 +1,357 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 10;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4a FOR TABLE tab_rowfilter_4 WHERE (c % 2 = 0)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4b FOR TABLE tab_rowfilter_4"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) SELECT generate_series(1, 10)");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4a, tap_pub_4b"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if the
+# c value is divisible by 2 OR 3 (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(13|2|20), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but when one publication has no filter it
+# just means OR everything - i.e. the same as applying no filter at all.
+# Expect all rows: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(10|1|10), 'check initial data copy from table tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) VALUES (21), (22), (23), (24), (25)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) VALUES (0), (11), (12)");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if the
+# c value is divisible by 2 OR 3.
+#
+# Expect original rows (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+# Plus (21, 22, 24)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(16|2|24), 'check replicated rows to tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but when one publication has no filter it
+# just means OR everything - i.e. the same as applying no filter at all.
+# Expect all rows from initial copy: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+# And also (0, 11, 12)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950)    NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
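As an aside for reviewers, the filter-combination semantics the TAP test above checks (filters from multiple publications are OR'ed together, and a publication without a filter publishes every row) can be sketched in a few lines of Python. This is an illustrative model of the intended behavior only, not code from the patch:

```python
# Toy model of the row-filter semantics: a row is published if ANY
# subscribed publication's filter accepts it, and a publication with
# no filter (modeled as None) accepts every row.

def row_is_published(row, filters):
    return any(f is None or f(row) for f in filters)

# tab_rowfilter_2: tap_pub_1 (c % 2 = 0) OR tap_pub_2 (c % 3 = 0)
pub_filters = [lambda r: r["c"] % 2 == 0, lambda r: r["c"] % 3 == 0]
copied = [c for c in range(1, 21) if row_is_published({"c": c}, pub_filters)]
# 13 rows: 2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20

# tab_rowfilter_4: tap_pub_4a (c % 2 = 0) OR tap_pub_4b (no filter)
all_rows = [c for c in range(1, 11)
            if row_is_published({"c": c}, [lambda r: r["c"] % 2 == 0, None])]
# all 10 rows survive, same as having no filter at all
```

The model reproduces the expected counts asserted by the test (13|2|20 for tab_rowfilter_2, 10|1|10 for tab_rowfilter_4).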
v44-0002-PS-Row-filter-validation-walker.patchapplication/octet-stream; name=v44-0002-PS-Row-filter-validation-walker.patchDownload
From 45386efd4c73b38657b13cd35f93f4fe63570ed9 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 1 Dec 2021 19:08:17 +1100
Subject: [PATCH v44] PS - Row filter validation walker
This patch implements a parse-tree "walker" to validate a row-filter expression.
REPLICA IDENTITY validation
---------------------------
For publish modes "delete" and "update" it validates that any columns referenced
in the filter expression are part of the REPLICA IDENTITY or primary key.
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
Expression Node-kind validation
-------------------------------
Only simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no user-defined functions.
- no user-defined types.
- no system functions (unless they are IMMUTABLE). See design decision at [1].
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr, NullIfExpr, NullTest
[1] https://www.postgresql.org/message-id/CAA4eK1%2BXoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ%40mail.gmail.com
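The node-kind rule above can be modeled as a small recursive walker; the sketch below is a Python toy over an illustrative dict-based "parse tree" (the real validation is the C function rowfilter_walker over PostgreSQL parse nodes):

```python
# Toy model of the node-kind whitelist: recursively reject any node kind
# outside the allowed set, and reject non-IMMUTABLE functions.
ALLOWED_NODES = {"Var", "OpExpr", "Const", "BoolExpr", "FuncExpr",
                 "NullIfExpr", "NullTest"}

def validate_filter(node):
    if node["kind"] not in ALLOWED_NODES:
        raise ValueError("invalid publication WHERE expression: %s" % node["kind"])
    # System functions must be IMMUTABLE; user-defined functions are rejected
    # earlier in the real walker by their OID range.
    if node["kind"] == "FuncExpr" and not node.get("immutable"):
        raise ValueError("only immutable system functions are allowed")
    for child in node.get("args", []):
        validate_filter(child)

# (a > 1000 AND b <> 'filtered') as a toy tree: passes validation.
ok = {"kind": "BoolExpr", "args": [
    {"kind": "OpExpr", "args": [{"kind": "Var"}, {"kind": "Const"}]},
    {"kind": "OpExpr", "args": [{"kind": "Var"}, {"kind": "Const"}]}]}
validate_filter(ok)
```

A subquery (SubLink) or any other node kind would raise an error, mirroring the "invalid publication WHERE expression" message in the patch.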
---
src/backend/catalog/pg_publication.c | 198 +++++++++++++++++++++++++++++-
src/backend/parser/parse_agg.c | 14 ++-
src/backend/parser/parse_expr.c | 22 ++--
src/backend/parser/parse_func.c | 6 +-
src/backend/parser/parse_oper.c | 7 --
src/test/regress/expected/publication.out | 144 +++++++++++++++++++---
src/test/regress/sql/publication.sql | 106 +++++++++++++++-
src/test/subscription/t/027_row_filter.pl | 7 +-
src/tools/pgindent/typedefs.list | 1 +
9 files changed, 448 insertions(+), 57 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 89d00cd..d67023a 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,9 +33,11 @@
#include "catalog/pg_publication_namespace.h"
#include "catalog/pg_publication_rel.h"
#include "catalog/pg_type.h"
+#include "catalog/pg_proc.h"
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parse_clause.h"
#include "parser/parse_collate.h"
#include "parser/parse_relation.h"
@@ -219,10 +221,199 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
+/* For rowfilter_walker. */
+typedef struct {
+ Relation rel;
+ bool check_replident; /* check if Var is bms_replident member? */
+ Bitmapset *bms_replident;
+} rf_context;
+
/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
+ * The row filter walker checks that the row filter expression is legal.
+ *
+ * Rules: Node-type validation
+ * ---------------------------
+ * Allow only simple or compound expressions such as:
+ * - "(Var Op Const)" or
+ * - "(Var Op Var)" or
+ * - "(Var Op Const) Bool (Var Op Const)"
+ * - etc
+ * (where Var is a column of the table this filter belongs to)
+ *
+ * Specifically,
+ * - User-defined operators are not allowed.
+ * - User-defined functions are not allowed.
+ * - User-defined types are not allowed.
+ * - System functions that are not IMMUTABLE are not allowed.
+ * - NULLIF is allowed.
+ * - IS NULL is allowed.
+ *
+ * Notes:
+ *
+ * We don't allow user-defined functions/operators/types because (a) if the user
+ * drops such a user-definition or if there is any other error via its function,
+ * the walsender won't be able to recover from such an error even if we fix the
+ * function's problem because a historic snapshot is used to access the
+ * row-filter; (b) any other table could be accessed via a function, which won't
+ * work because of historic snapshots in logical decoding environment.
+ *
+ * We don't allow anything other than immutable built-in functions because those
+ * (not immutable ones) can access database and would lead to the problem (b)
+ * mentioned in the previous paragraph.
+ *
+ * Rules: Replica Identity validation
+ * -----------------------------------
+ * If the flag context.check_replident is true then validate that every variable
+ * referenced by the filter expression is a valid member of the allowed set of
+ * replica identity columns (context.bms_replident).
*/
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+ bool too_complex = false;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+
+ /* User-defined types not allowed. */
+ if (var->vartype >= FirstNormalObjectId)
+ forbidden = _("user-defined types are not allowed");
+
+ /* Optionally, do replica identity validation of the referenced column. */
+ if (context->check_replident)
+ {
+ Oid relid = RelationGetRelid(context->rel);
+ AttrNumber attnum = var->varattno;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, context->bms_replident))
+ {
+ const char *colname = get_attname(relid, attnum, false);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ RelationGetRelationName(context->rel)),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+ }
+ }
+ else if (IsA(node, Const) || IsA(node, BoolExpr) || IsA(node, NullIfExpr)
+ || IsA(node, NullTest))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ Oid funcid = ((FuncExpr *)node)->funcid;
+ char *funcname = get_func_name(funcid);
+
+ /*
+ * User-defined functions are not allowed.
+ * System-functions that are not IMMUTABLE are not allowed.
+ */
+ if (funcid >= FirstNormalObjectId)
+ {
+ forbidden = psprintf("user-defined functions are not allowed: %s",
+ funcname);
+ }
+ else
+ {
+ if (func_volatile(funcid) != PROVOLATILE_IMMUTABLE)
+ forbidden = psprintf("system functions that are not IMMUTABLE are not allowed: %s",
+ funcname);
+ }
+ }
+ else
+ {
+ elog(DEBUG1, "the row filter contained something unexpected: %s", nodeToString(node));
+ too_complex = true;
+ }
+
+ if (too_complex)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(context->rel)),
+ errhint("Only simple expressions using columns, constants and immutable system functions are allowed.")
+ ));
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(context->rel)),
+ errdetail("%s", forbidden)
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
+/*
+ * Check if the row-filter is valid according to the following rules:
+ *
+ * 1. Only certain simple node types are permitted in the expression. See
+ * function rowfilter_walker for details.
+ *
+ * 2. If the publish operation contains "delete" or "update" then only columns
+ * that are allowed by the REPLICA IDENTITY rules are permitted to be used in
+ * the row-filter WHERE clause.
+ */
+static void
+rowfilter_expr_checker(Publication *pub, Relation rel, Node *rfnode)
+{
+ rf_context context = {0};
+
+ context.rel = rel;
+
+ /*
+ * For "delete" or "update", check that filter cols are also valid replica
+ * identity cols.
+ */
+ if (pub->pubactions.pubdelete || pub->pubactions.pubupdate)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ context.check_replident = true;
+
+ /*
+ * Find the columns that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+ }
+ }
+
+ /*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if anything not permitted or unexpected is encountered.
+ */
+ rowfilter_walker(rfnode, &context);
+
+ bms_free(context.bms_replident);
+}
+
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -333,6 +524,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
whereclause = GetTransformedWhereClause(pstate, pri, true);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, targetrel, whereclause);
}
/* Form a tuple. */
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..f65a86f 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,10 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
- if (isAgg)
- err = _("aggregate functions are not allowed in publication WHERE expressions");
- else
- err = _("grouping operations are not allowed in publication WHERE expressions");
-
+ /*
+ * OK for now. The row-filter validation is done later by a walker
+ * function (see pg_publication).
+ */
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +950,10 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
- err = _("window functions are not allowed in publication WHERE expressions");
+ /*
+ * OK for now. The row-filter validation is done later by a walker
+ * function (see pg_publication).
+ */
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..d8627b9 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,19 +200,8 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- {
- /*
- * Forbid functions in publication WHERE condition
- */
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("functions are not allowed in publication WHERE expressions"),
- parser_errposition(pstate, exprLocation(expr))));
-
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
- }
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -1777,7 +1766,10 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
- err = _("cannot use subquery in publication WHERE expression");
+ /*
+ * OK for now. The row-filter validation is done later by a walker
+ * function (see pg_publication).
+ */
break;
/*
@@ -3100,7 +3092,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 29bebb7..4e4557f 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,11 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
- err = _("set-returning functions are not allowed in publication WHERE expressions");
+ /*
+ * OK for now. The row-filter validation is done later by a walker
+ * function (see pg_publication).
+ */
+ pstate->p_hasTargetSRFs = true;
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..bc34a23 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,13 +718,6 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
- /* Check it's not a custom operator for publication WHERE expressions */
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("user-defined operators are not allowed in publication WHERE expressions"),
- parser_errposition(pstate, location)));
-
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 6959675..d9ee9ff 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -248,13 +248,15 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -264,7 +266,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -275,7 +277,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -286,7 +288,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
@@ -310,26 +312,26 @@ Publications:
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax1
Publication testpub_syntax1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE ((e < 999))
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax2
Publication testpub_syntax2
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
@@ -353,19 +355,41 @@ ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: user-defined functions are not allowed: testpub_rf_func99
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: system functions that are not IMMUTABLE are not allowed: random
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
+-- ok - IS NULL is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+-- fail - user-defined types disallowed
+CREATE TYPE rf_bug_status AS ENUM ('new', 'open', 'closed');
+CREATE TABLE rf_bug (id serial, description text, status rf_bug_status);
+CREATE PUBLICATION testpub6 FOR TABLE rf_bug WHERE (status = 'open') WITH (publish="insert");
+ERROR: invalid publication WHERE expression for relation "rf_bug"
+DETAIL: user-defined types are not allowed
+DROP TABLE rf_bug;
+DROP TYPE rf_bug_status;
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+HINT: only simple expressions using columns, constants and immutable system functions are allowed
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
@@ -387,6 +411,92 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 40198fc..fcc09b1 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -143,7 +143,9 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -163,12 +165,12 @@ RESET client_min_messages;
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax1
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax2
DROP PUBLICATION testpub_syntax2;
@@ -182,13 +184,31 @@ SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
+-- ok - IS NULL is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - user-defined types disallowed
+CREATE TYPE rf_bug_status AS ENUM ('new', 'open', 'closed');
+CREATE TABLE rf_bug (id serial, description text, status rf_bug_status);
+CREATE PUBLICATION testpub6 FOR TABLE rf_bug WHERE (status = 'open') WITH (publish="insert");
+DROP TABLE rf_bug;
+DROP TYPE rf_bug_status;
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
SET client_min_messages = 'ERROR';
@@ -208,6 +228,82 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
+
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index 64e71d0..de6b73d 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -280,9 +282,7 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -291,7 +291,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index f41ef0d..575969c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3501,6 +3501,7 @@ replace_rte_variables_context
ret_type
rewind_source
rewrite_event
+rf_context
rijndael_ctx
rm_detail_t
role_auth_extra
--
1.8.3.1
v44-0003-Support-updates-based-on-old-and-new-tuple-in-ro.patch (application/octet-stream)
From 436125eee26d322a91f426ebc104b48def45b415 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 1 Dec 2021 19:23:29 +1100
Subject: [PATCH v44] Support updates based on old and new tuple in row
filters.
When applying the row filter to updates, check both old_tuple and
new_tuple to decide how the update needs to be transformed.
UPDATE
old-row (match)    new-row (no match) -> DELETE
old-row (no match) new-row (match)    -> INSERT
old-row (match)    new-row (match)    -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Also, tuples that have been deformed are cached in slots to avoid
deforming the same tuple multiple times.
Author: Ajin Cherian
---
src/backend/replication/logical/proto.c | 35 +++--
src/backend/replication/pgoutput/pgoutput.c | 194 +++++++++++++++++++++++++---
src/include/replication/logicalproto.h | 7 +-
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/027_row_filter.pl | 4 +-
src/tools/pgindent/typedefs.list | 1 +
6 files changed, 209 insertions(+), 38 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..6b55a94 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -31,8 +32,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
- HeapTuple tuple, bool binary);
-
+ HeapTuple tuple, TupleTableSlot *slot,
+ bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -410,7 +411,7 @@ logicalrep_write_insert(StringInfo out, TransactionId xid, Relation rel,
pq_sendint32(out, RelationGetRelid(rel));
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, NULL, binary);
}
/*
@@ -442,7 +443,8 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
*/
void
logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
- HeapTuple oldtuple, HeapTuple newtuple, bool binary)
+ HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary)
{
pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
@@ -463,11 +465,11 @@ logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
pq_sendbyte(out, 'O'); /* old tuple follows */
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, oldslot, binary);
}
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, newslot, binary);
}
/*
@@ -536,7 +538,7 @@ logicalrep_write_delete(StringInfo out, TransactionId xid, Relation rel,
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, NULL, binary);
}
/*
@@ -749,11 +751,12 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
-logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binary)
+logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, TupleTableSlot *slot,
+bool binary)
{
TupleDesc desc;
- Datum values[MaxTupleAttributeNumber];
- bool isnull[MaxTupleAttributeNumber];
+ Datum *values;
+ bool *isnull;
int i;
uint16 nliveatts = 0;
@@ -771,7 +774,17 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binar
enlargeStringInfo(out, tuple->t_len +
nliveatts * (1 + 4));
- heap_deform_tuple(tuple, desc, values, isnull);
+ if (slot == NULL || TTS_EMPTY(slot))
+ {
+ values = (Datum *) palloc(desc->natts * sizeof(Datum));
+ isnull = (bool *) palloc(desc->natts * sizeof(bool));
+ heap_deform_tuple(tuple, desc, values, isnull);
+ }
+ else
+ {
+ values = slot->tts_values;
+ isnull = slot->tts_isnull;
+ }
/* Write the values */
for (i = 0; i < desc->natts; i++)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 3b85915..0ccffa7 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,6 +13,7 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
#include "catalog/pg_publication_rel.h"
@@ -25,6 +26,7 @@
#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
#include "utils/builtins.h"
@@ -132,7 +134,10 @@ typedef struct RelationSyncEntry
*/
bool rowfilter_valid;
ExprState *exprstate; /* ExprState for row filter(s) */
- TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
+ TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
+ TupleTableSlot *tmp_new_tuple; /* slot for temporary new tuple used for expression evaluation */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -167,10 +172,15 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
- HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, TupleTableSlot *slot,
+ RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry,
+ ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -734,18 +744,112 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * The update is checked against the row filter, if any.
+ * Updates are transformed to inserts or deletes based on the
+ * old_tuple and new_tuple. The new action is returned in the
+ * action parameter; if not changed, action remains UPDATE.
+ * old-row (no match)  new-row (no match)  -> (drop change)
+ * old-row (no match)  new-row (match)     -> INSERT
+ * old-row (match)     new-row (no match)  -> DELETE
+ * old-row (match)     new-row (match)     -> UPDATE
+ * Returns true if the change is to be replicated, else false.
*/
static bool
-pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
+{
+ TupleDesc desc = RelationGetDescr(relation);
+ int i;
+ bool old_matched, new_matched;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ /* update requires a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /*
+ * If no old_tuple, then none of the replica identity columns changed
+ * and this would reduce to a simple update.
+ */
+ if (!oldtuple)
+ {
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ return pgoutput_row_filter(relation, NULL, newtuple, NULL, entry);
+ }
+
+ old_slot = entry->old_tuple;
+ new_slot = entry->new_tuple;
+ tmp_new_slot = entry->tmp_new_tuple;
+ ExecClearTuple(old_slot);
+ ExecClearTuple(new_slot);
+ ExecClearTuple(tmp_new_slot);
+
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, new_slot);
+
+ /*
+ * For updates, both the newtuple and oldtuple need to be checked
+ * against the row filter. The newtuple might not have all the
+ * replica identity columns, in which case they need to be
+ * copied over from the oldtuple.
+ */
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ /* if the column in the new_tuple is null, nothing to do */
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ /*
+ * Unchanged toasted replica identity columns are only
+ * detoasted in the old tuple; copy them over to the new tuple.
+ */
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ Assert(!old_slot->tts_isnull[i] &&
+ !VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]));
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ }
+
+ }
+
+ old_matched = pgoutput_row_filter(relation, NULL, NULL, old_slot, entry);
+ new_matched = pgoutput_row_filter(relation, NULL, NULL, tmp_new_slot, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (!old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && old_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter, the first time.
+ */
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
{
- EState *estate;
- ExprContext *ecxt;
ListCell *lc;
- bool result = true;
- Oid relid = RelationGetRelid(relation);
List *rfnodes = NIL;
int n_filters;
@@ -857,16 +961,34 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
/*
- * Create a tuple table slot for row filter. TupleDesc must live as
- * long as the cache remains.
+ * Create tuple table slots for row filter. Create a copy of the
+ * TupleDesc as it needs to live as long as the cache remains.
*/
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
MemoryContextSwitchTo(oldctx);
}
entry->rowfilter_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *slot,
+RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
/* Bail out if there is no row filter */
if (!entry->exprstate)
@@ -885,7 +1007,12 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
ecxt = GetPerTupleExprContext(estate);
ecxt->ecxt_scantuple = entry->scantuple;
- ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+ if (newtuple || oldtuple)
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+ else
+ {
+ ecxt->ecxt_scantuple = slot;
+ }
/*
* NOTE: Multiple publication row-filters have already been combined to a
@@ -898,7 +1025,6 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
}
/* Cleanup allocated resources */
- ResetExprContext(ecxt);
FreeExecutorState(estate);
PopActiveSnapshot();
@@ -956,6 +1082,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -964,7 +1093,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(relation, NULL, tuple, NULL, relentry))
break;
/*
@@ -995,9 +1124,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
- /* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter_update_check(relation, oldtuple, newtuple, relentry,
+ &modified_action))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1020,8 +1150,27 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ logicalrep_write_update(ctx->out, xid, relation,
+ oldtuple, newtuple, relentry->old_tuple,
+ relentry->new_tuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1031,7 +1180,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1449,6 +1598,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
+ entry->new_tuple = NULL;
+ entry->old_tuple = NULL;
+ entry->tmp_new_tuple = NULL;
entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..427c40a 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -211,7 +212,11 @@ extern void logicalrep_write_insert(StringInfo out, TransactionId xid,
extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup);
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
- HeapTuple newtuple, bool binary);
+ HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index de6b73d..a2f25f6 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -277,7 +277,8 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+#   but the row is deleted by the update below (the filter no longer matches).
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -289,7 +290,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 575969c..e8dc5ad 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2198,6 +2198,7 @@ ReorderBufferApplyChangeCB
ReorderBufferApplyTruncateCB
ReorderBufferBeginCB
ReorderBufferChange
+ReorderBufferChangeType
ReorderBufferCommitCB
ReorderBufferCommitPreparedCB
ReorderBufferDiskChange
--
1.8.3.1
v44-0005-cache-the-result-of-row-filter-column-validation.patchapplication/octet-stream; name=v44-0005-cache-the-result-of-row-filter-column-validation.patchDownload
From cd7e65a869fdff0efdbb892df0dfed0b1d433f87 Mon Sep 17 00:00:00 2001
From: "houzj.fnst" <houzj.fnst@fujitsu.com>
Date: Thu, 2 Dec 2021 11:31:41 +0800
Subject: [PATCH] cache the result of row filter column validation
For publish modes "delete" and "update", validate that any columns referenced
in the filter expression are part of the REPLICA IDENTITY or primary key.
Move the row filter column validation to CheckCmdReplicaIdentity, so that
the validation is executed only when an actual UPDATE or DELETE is executed
on the published relation. This is consistent with the existing replica
identity check.
Cache the result of the row filter column validation in the relcache to
reduce the cost of the validation. This is safe because every operation that
changes the row filter or replica identity will invalidate the relcache.
---
src/backend/catalog/pg_publication.c | 104 ++-----------
src/backend/executor/execReplication.c | 35 ++++-
src/backend/utils/cache/relcache.c | 173 +++++++++++++++++++---
src/include/utils/rel.h | 6 +
src/include/utils/relcache.h | 1 +
src/test/regress/expected/publication.out | 56 ++++---
src/test/regress/sql/publication.sql | 40 +++--
7 files changed, 263 insertions(+), 152 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index d67023a440..b9619ef581 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -221,13 +221,6 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
-/* For rowfilter_walker. */
-typedef struct {
- Relation rel;
- bool check_replident; /* check if Var is bms_replident member? */
- Bitmapset *bms_replident;
-} rf_context;
-
/*
* The row filter walker checks that the row filter expression is legal.
*
@@ -260,15 +253,9 @@ typedef struct {
* We don't allow anything other than immutable built-in functions because those
* (not immutable ones) can access database and would lead to the problem (b)
* mentioned in the previous paragraph.
- *
- * Rules: Replica Identity validation
- * -----------------------------------
- * If the flag context.check_replident is true then validate that every variable
- * referenced by the filter expression is a valid member of the allowed set of
- * replica identity columns (context.bms_replindent)
*/
static bool
-rowfilter_walker(Node *node, rf_context *context)
+rowfilter_walker(Node *node, Relation relation)
{
char *forbidden = NULL;
bool too_complex = false;
@@ -283,25 +270,6 @@ rowfilter_walker(Node *node, rf_context *context)
/* User-defined types not allowed. */
if (var->vartype >= FirstNormalObjectId)
forbidden = _("user-defined types are not allowed");
-
- /* Optionally, do replica identify validation of the referenced column. */
- if (context->check_replident)
- {
- Oid relid = RelationGetRelid(context->rel);
- AttrNumber attnum = var->varattno;
-
- if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, context->bms_replident))
- {
- const char *colname = get_attname(relid, attnum, false);
-
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
- errmsg("cannot add relation \"%s\" to publication",
- RelationGetRelationName(context->rel)),
- errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
- colname)));
- }
- }
}
else if (IsA(node, Const) || IsA(node, BoolExpr) || IsA(node, NullIfExpr)
|| IsA(node, NullTest))
@@ -344,74 +312,18 @@ rowfilter_walker(Node *node, rf_context *context)
if (too_complex)
ereport(ERROR,
(errmsg("invalid publication WHERE expression for relation \"%s\"",
- RelationGetRelationName(context->rel)),
+ RelationGetRelationName(relation)),
errhint("only simple expressions using columns, constants and immutable system functions are allowed")
));
if (forbidden)
ereport(ERROR,
(errmsg("invalid publication WHERE expression for relation \"%s\"",
- RelationGetRelationName(context->rel)),
+ RelationGetRelationName(relation)),
errdetail("%s", forbidden)
));
- return expression_tree_walker(node, rowfilter_walker, (void *)context);
-}
-
-/*
- * Check if the row-filter is valid according to the following rules:
- *
- * 1. Only certain simple node types are permitted in the expression. See
- * function rowfilter_walker for details.
- *
- * 2. If the publish operation contains "delete" or "update" then only columns
- * that are allowed by the REPLICA IDENTITY rules are permitted to be used in
- * the row-filter WHERE clause.
- */
-static void
-rowfilter_expr_checker(Publication *pub, Relation rel, Node *rfnode)
-{
- rf_context context = {0};
-
- context.rel = rel;
-
- /*
- * For "delete" or "update", check that filter cols are also valid replica
- * identity cols.
- */
- if (pub->pubactions.pubdelete || pub->pubactions.pubupdate)
- {
- char replica_identity = rel->rd_rel->relreplident;
-
- if (replica_identity == REPLICA_IDENTITY_FULL)
- {
- /*
- * FULL means all cols are in the REPLICA IDENTITY, so all cols are
- * allowed in the row-filter too.
- */
- }
- else
- {
- context.check_replident = true;
-
- /*
- * Find what are the cols that are part of the REPLICA IDENTITY.
- * Note that REPLICA IDENTIY DEFAULT means primary key or nothing.
- */
- if (replica_identity == REPLICA_IDENTITY_DEFAULT)
- context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
- else
- context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
- }
- }
-
- /*
- * Walk the parse-tree of this publication row filter expression and throw an
- * error if anything not permitted or unexpected is encountered.
- */
- rowfilter_walker(rfnode, &context);
-
- bms_free(context.bms_replident);
+ return expression_tree_walker(node, rowfilter_walker, (void *) relation);
}
List *
@@ -525,8 +437,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
whereclause = GetTransformedWhereClause(pstate, pri, true);
- /* Validate the row-filter. */
- rowfilter_expr_checker(pub, targetrel, whereclause);
+ /*
+ * Walk the parse-tree of this publication row filter expression and
+ * throw an error if anything not permitted or unexpected is
+ * encountered.
+ */
+ rowfilter_walker(whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 574d7d27fd..0dcc9dfe68 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -568,14 +568,45 @@ void
CheckCmdReplicaIdentity(Relation rel, CmdType cmd)
{
PublicationActions *pubactions;
+ AttrNumber invalid_rfcol;
/* We only need to do checks for UPDATE and DELETE. */
if (cmd != CMD_UPDATE && cmd != CMD_DELETE)
return;
+ if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ return;
+
+ invalid_rfcol = RelationGetInvalRowFilterCol(rel);
+
+ /*
+ * It is only safe to execute UPDATE/DELETE when all columns of the row
+ * filters from publications which the relation is in are part of the
+ * REPLICA IDENTITY.
+ */
+ if (invalid_rfcol != InvalidAttrNumber)
+ {
+ const char *colname = get_attname(RelationGetRelid(rel),
+ invalid_rfcol, false);
+
+ if (cmd == CMD_UPDATE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot update table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ else if (cmd == CMD_DELETE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot delete from table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+
/* If relation has replica identity we are always good. */
- if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
- OidIsValid(RelationGetReplicaIndex(rel)))
+ if (OidIsValid(RelationGetReplicaIndex(rel)))
return;
/*
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index e1ea079e9e..3d4efa802c 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -56,6 +56,7 @@
#include "catalog/pg_opclass.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "catalog/pg_rewrite.h"
#include "catalog/pg_shseclabel.h"
#include "catalog/pg_statistic_ext.h"
@@ -5548,28 +5549,69 @@ RelationGetExclusionInfo(Relation indexRelation,
MemoryContextSwitchTo(oldcxt);
}
+/* For invalid_rowfilter_column_walker. */
+typedef struct {
+ AttrNumber invalid_rfcol;
+ Bitmapset *bms_replident;
+} rf_context;
+
/*
- * Get publication actions for the given relation.
+ * Check if any columns used in the row-filter WHERE clause are not part of
+ * REPLICA IDENTITY and save the invalid column number in
+ * rf_context::invalid_rfcol.
*/
-struct PublicationActions *
-GetRelationPublicationActions(Relation relation)
+static bool
+invalid_rowfilter_column_walker(Node *node, rf_context *context)
{
- List *puboids;
- ListCell *lc;
- MemoryContext oldcxt;
- Oid schemaid;
- PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+ AttrNumber attnum = var->varattno;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber,
+ context->bms_replident))
+ {
+ context->invalid_rfcol = attnum;
+ return true;
+ }
+ }
+
+ return expression_tree_walker(node, invalid_rowfilter_column_walker,
+ (void *) context);
+}
+
+/*
+ * Get the invalid row filter column number for the given relation.
+ *
+ * Traverse all the publications which the relation is in to get the
+ * publication actions. If the publication actions include UPDATE or DELETE,
+ * then validate that all columns referenced in the row filter expression
+ * are part of REPLICA IDENTITY.
+ *
+ * If some row filter column is not part of the REPLICA IDENTITY, return its
+ * column number; otherwise return InvalidAttrNumber.
+ */
+AttrNumber
+RelationGetInvalRowFilterCol(Relation relation)
+{
+ List *puboids;
+ ListCell *lc;
+ MemoryContext oldcxt;
+ Oid schemaid;
+ rf_context context = { 0 };
+ PublicationActions pubactions = { 0 };
+ bool rfcol_valid = true;
+ AttrNumber invalid_rfcol = InvalidAttrNumber;
/*
* If not publishable, it publishes no actions. (pgoutput_change() will
* ignore it.)
*/
- if (!is_publishable_relation(relation))
- return pubactions;
-
- if (relation->rd_pubactions)
- return memcpy(pubactions, relation->rd_pubactions,
- sizeof(PublicationActions));
+ if (!is_publishable_relation(relation) || relation->rd_rfcol_valid)
+ return invalid_rfcol;
/* Fetch the publication membership info. */
puboids = GetRelationPublications(RelationGetRelid(relation));
@@ -5595,10 +5637,22 @@ GetRelationPublicationActions(Relation relation)
}
puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
+ /*
+ * Find what are the cols that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (relation->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT)
+ context.bms_replident = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else if (relation->rd_rel->relreplident == REPLICA_IDENTITY_INDEX)
+ context.bms_replident = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_IDENTITY_KEY);
+
foreach(lc, puboids)
{
Oid pubid = lfirst_oid(lc);
HeapTuple tup;
+
Form_pg_publication pubform;
tup = SearchSysCache1(PUBLICATIONOID, ObjectIdGetDatum(pubid));
@@ -5608,35 +5662,105 @@ GetRelationPublicationActions(Relation relation)
pubform = (Form_pg_publication) GETSTRUCT(tup);
- pubactions->pubinsert |= pubform->pubinsert;
- pubactions->pubupdate |= pubform->pubupdate;
- pubactions->pubdelete |= pubform->pubdelete;
- pubactions->pubtruncate |= pubform->pubtruncate;
+ pubactions.pubinsert |= pubform->pubinsert;
+ pubactions.pubupdate |= pubform->pubupdate;
+ pubactions.pubdelete |= pubform->pubdelete;
+ pubactions.pubtruncate |= pubform->pubtruncate;
ReleaseSysCache(tup);
/*
- * If we know everything is replicated, there is no point to check for
- * other publications.
+ * If the publication actions include UPDATE or DELETE, validate
+ * that any columns referenced in the filter expression are part of
+ * the REPLICA IDENTITY index.
+ *
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter and we can skip the validation.
+ *
+ * If we have already found a row filter column that is not part
+ * of the REPLICA IDENTITY index, skip the validation too.
*/
- if (pubactions->pubinsert && pubactions->pubupdate &&
- pubactions->pubdelete && pubactions->pubtruncate)
+ if ((pubform->pubupdate || pubform->pubdelete) &&
+ relation->rd_rel->relreplident != REPLICA_IDENTITY_FULL &&
+ rfcol_valid)
+ {
+ HeapTuple rftuple;
+
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP,
+ ObjectIdGetDatum(RelationGetRelid(relation)),
+ ObjectIdGetDatum(pubid));
+
+ if (HeapTupleIsValid(rftuple))
+ {
+ Datum rfdatum;
+ bool rfisnull;
+ Node *rfnode;
+
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
+ Anum_pg_publication_rel_prqual,
+ &rfisnull);
+
+ if (!rfisnull)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfcol_valid = !invalid_rowfilter_column_walker(rfnode,
+ &context);
+ invalid_rfcol = context.invalid_rfcol;
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+ }
+
+ /*
+ * If we know everything is replicated and some column is not part of the
+ * replica identity, there is no point in checking other publications.
+ */
+ if (pubactions.pubinsert && pubactions.pubupdate &&
+ pubactions.pubdelete && pubactions.pubtruncate &&
+ !rfcol_valid)
break;
}
+ bms_free(context.bms_replident);
+
if (relation->rd_pubactions)
{
pfree(relation->rd_pubactions);
relation->rd_pubactions = NULL;
}
+ relation->rd_rfcol_valid = rfcol_valid;
+
/* Now save copy of the actions in the relcache entry. */
oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
relation->rd_pubactions = palloc(sizeof(PublicationActions));
- memcpy(relation->rd_pubactions, pubactions, sizeof(PublicationActions));
+ memcpy(relation->rd_pubactions, &pubactions, sizeof(PublicationActions));
MemoryContextSwitchTo(oldcxt);
- return pubactions;
+ return invalid_rfcol;
+}
+
+/*
+ * Get publication actions for the given relation.
+ */
+struct PublicationActions *
+GetRelationPublicationActions(Relation relation)
+{
+ PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
+
+ /*
+ * If not publishable, it publishes no actions. (pgoutput_change() will
+ * ignore it.)
+ */
+ if (!is_publishable_relation(relation))
+ return pubactions;
+
+ if (!relation->rd_pubactions)
+ (void) RelationGetInvalRowFilterCol(relation);
+
+ return memcpy(pubactions, relation->rd_pubactions,
+ sizeof(PublicationActions));
}
/*
@@ -6193,6 +6317,7 @@ load_relcache_init_file(bool shared)
rel->rd_idattr = NULL;
rel->rd_hotblockingattr = NULL;
rel->rd_pubactions = NULL;
+ rel->rd_rfcol_valid = false;
rel->rd_statvalid = false;
rel->rd_statlist = NIL;
rel->rd_fkeyvalid = false;
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 31281279cf..84c58f9fe2 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -163,6 +163,12 @@ typedef struct RelationData
PublicationActions *rd_pubactions; /* publication actions */
+ /*
+ * True if the columns referenced by row filters of all the publications
+ * the relation is in are part of the replica identity.
+ */
+ bool rd_rfcol_valid;
+
/*
* rd_options is set whenever rd_rel is loaded into the relcache entry.
* Note that you can NOT look into rd_rel for this data. NULL means "use
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 82316bba54..1f091af904 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -76,6 +76,7 @@ extern void RelationInitIndexAccessInfo(Relation relation);
/* caller must include pg_publication.h */
struct PublicationActions;
extern struct PublicationActions *GetRelationPublicationActions(Relation relation);
+extern AttrNumber RelationGetInvalRowFilterCol(Relation relation);
extern void RelationInitTableAccessMethod(Relation relation);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index d9ee9ff645..5affa973aa 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -426,21 +426,27 @@ DROP PUBLICATION testpub6;
-- ok - "b" is a PK col
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
-RESET client_min_messages;
DROP PUBLICATION testpub6;
--- fail - "c" is not part of the PK
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+-- fail - "c" is not part of the PK
+UPDATE rf_tbl_abcd_pk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
--- fail - "d" is not part of the PK
+DROP PUBLICATION testpub6;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+-- fail - "d" is not part of the PK
+UPDATE rf_tbl_abcd_pk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
--- fail - "a" is not part of REPLICA IDENTITY
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+-- fail - "a" is not part of REPLICA IDENTITY
+UPDATE rf_tbl_abcd_nopk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
-- Case 2. REPLICA IDENTITY FULL
ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
@@ -454,21 +460,29 @@ SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
DROP PUBLICATION testpub6;
RESET client_min_messages;
+SET client_min_messages = 'ERROR';
-- Case 3. REPLICA IDENTITY NOTHING
ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
--- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
--- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+DROP PUBLICATION testpub6;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
--- fail - "a" is not in REPLICA IDENTITY NOTHING
+DROP PUBLICATION testpub6;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_nopk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
-- Case 4. REPLICA IDENTITY INDEX
ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
@@ -476,21 +490,23 @@ ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
--- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+update rf_tbl_abcd_pk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
-SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
DROP PUBLICATION testpub6;
-RESET client_min_messages;
--- fail - "a" is not in REPLICA IDENTITY INDEX
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+update rf_tbl_abcd_nopk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
-- ok - "c" is part of REPLICA IDENTITY INDEX
-SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
DROP PUBLICATION testpub6;
RESET client_min_messages;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index fcc09b1c23..559bc267f0 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -245,15 +245,21 @@ DROP PUBLICATION testpub6;
-- ok - "b" is a PK col
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
-RESET client_min_messages;
DROP PUBLICATION testpub6;
--- fail - "c" is not part of the PK
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
--- fail - "d" is not part of the PK
+-- fail - "c" is not part of the PK
+UPDATE rf_tbl_abcd_pk set a = 1;
+DROP PUBLICATION testpub6;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- fail - "d" is not part of the PK
+UPDATE rf_tbl_abcd_pk set a = 1;
+DROP PUBLICATION testpub6;
-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
--- fail - "a" is not part of REPLICA IDENTITY
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not part of REPLICA IDENTITY
+UPDATE rf_tbl_abcd_nopk set a = 1;
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
-- Case 2. REPLICA IDENTITY FULL
ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
@@ -269,15 +275,23 @@ CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
DROP PUBLICATION testpub6;
RESET client_min_messages;
+SET client_min_messages = 'ERROR';
-- Case 3. REPLICA IDENTITY NOTHING
ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
--- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
--- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk set a = 1;
+DROP PUBLICATION testpub6;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
--- fail - "a" is not in REPLICA IDENTITY NOTHING
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk set a = 1;
+DROP PUBLICATION testpub6;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_nopk set a = 1;
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
-- Case 4. REPLICA IDENTITY INDEX
ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
@@ -286,17 +300,19 @@ ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
--- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+update rf_tbl_abcd_pk set a = 1;
+DROP PUBLICATION testpub6;
-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
-SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
DROP PUBLICATION testpub6;
-RESET client_min_messages;
--- fail - "a" is not in REPLICA IDENTITY INDEX
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+update rf_tbl_abcd_nopk set a = 1;
+DROP PUBLICATION testpub6;
-- ok - "c" is part of REPLICA IDENTITY INDEX
-SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
DROP PUBLICATION testpub6;
RESET client_min_messages;
--
2.18.4
On Wed, Dec 1, 2021 at 3:27 AM vignesh C <vignesh21@gmail.com> wrote:
Here we will not be able to do a direct comparison as we store the
transformed where clause in the pg_publication_rel table. We will have
to transform the where clause and then check. I have attached a patch
that checks the transformed where clause and sees whether the where
clause is the same or not. If you are OK with this approach, you could
make similar changes.
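The idea of comparing the transformed where clause rather than its raw text can be sketched outside PostgreSQL. The helper below is purely illustrative (it is not the patch's actual code): it parses both expressions into a tree and compares the trees, so textual differences such as whitespace or redundant parentheses don't matter.

```python
import ast

def same_filter(expr_a: str, expr_b: str) -> bool:
    """Compare two filter expressions by their parsed (transformed)
    form, so cosmetic differences in the source text are ignored."""
    return (ast.dump(ast.parse(expr_a, mode="eval"))
            == ast.dump(ast.parse(expr_b, mode="eval")))

# "(a > 99)" and "a>99" differ textually but parse to the same tree,
# while "a > 99" and "a > 100" do not.
```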
Thanks for your patch. I have used the same logic with minor changes
and shared it with Peter for v44.
regards,
Ajin Cherian
Fujitsu Australia
On Tue, Nov 30, 2021 at 3:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Nov 29, 2021 at 8:40 PM Euler Taveira <euler@eulerto.com> wrote:
On Mon, Nov 29, 2021, at 7:11 AM, Amit Kapila wrote:
I don't think it is a good idea to combine the row-filter from the
publication that publishes just 'insert' with the row-filter that
publishes 'updates'. We shouldn't apply the 'insert' filter for
'update' and similarly for publication operations. We can combine the
filters when the published operations are the same. So, this means
that we might need to cache multiple row-filters but I think that is
better than having another restriction that publish operation 'insert'
should also honor RI columns restriction.That's exactly what I meant to say but apparently I didn't explain in details.
If a subscriber has multiple publications and a table is part of these
publications with different row filters, it should check the publication action
*before* including it in the row filter list. It means that an UPDATE operation
cannot apply a row filter that is part of a publication that has only INSERT as
an action. Having said that, we cannot always combine multiple row filter
expressions into one. Instead, it should cache individual row filter expressions
and apply the OR during row filter execution (as I did in the initial
patches before this caching stuff). The other idea is to have multiple caches
for each action. The main disadvantage of that approach is that it creates 4x
entries.

I'm experimenting with the first approach, which stores multiple row filters and
their publication actions, right now.

We can try that way, but I think we should still be able to combine in
many cases, like where all the operations are specified for
publications having the table, or maybe the pubactions are the same. So, we
should not give up on those cases. We can do this new logic only when
we find that the pubactions are different, and probably store them as
independent expressions and corresponding pubactions at the
current location in the v42* patch (in pgoutput_row_filter). It is
okay to combine them at a later stage during execution when we can't
do it at the time of forming the cache entry.
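The combining rule being discussed can be sketched as a small decision function. This is an illustrative model only (names and data shapes are invented, not the patch's API): a publication's filter only applies when its pubactions cover the current operation, a publication with no filter publishes every row, and multiple applicable filters are OR-combined.

```python
# Illustrative sketch of per-pubaction row-filter combining.
# `publications` is a list of (pubactions, filter) pairs, where
# pubactions is a set like {"insert", "update"} and filter is a
# predicate over the row, or None meaning "no filter" (match all).

def row_matches(publications, operation, row):
    """Return True if `row` should be replicated for `operation`."""
    relevant = [f for actions, f in publications if operation in actions]
    if not relevant:
        return False                 # no publication covers this operation
    if any(f is None for f in relevant):
        return True                  # an unfiltered publication wins
    return any(f(row) for f in relevant)  # OR-combine the row filters

pubs = [
    ({"insert"}, lambda r: r["id"] > 2000),
    ({"update", "delete"}, lambda r: r["id"] <= 3000),
]
```

Note how an UPDATE here never consults the INSERT-only publication's filter, which is the restriction being argued for above.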
PSA a new v44* patch set.
It includes a new patch 0006 which implements the idea above.
ExprState cache logic is basically all the same as before (including
all the OR combining), but there are now 4x ExprState caches keyed and
separated by the 4x different pubactions.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v44-0004-Tab-auto-complete-and-pgdump-support-for-Row-Fil.patch (application/octet-stream)
From d4b7347ec6018df74333bf354dd4d81309b86e47 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 2 Dec 2021 16:33:31 +1100
Subject: [PATCH v44] Tab auto-complete and pgdump support for Row Filter.
tab-auto-complete
-----------------
e.g.
"CREATE PUBLICATION <name> FOR TABLE <name>" - complete with "WHERE (".
"ALTER PUBLICATION <name> ADD|SET TABLE <name>" - complete with "WHERE (".
Author: Peter Smith
pg_dump
-------
Author: Euler Taveira
---
src/bin/pg_dump/pg_dump.c | 24 ++++++++++++++++++++----
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/tab-complete.c | 15 +++++++++++++--
3 files changed, 34 insertions(+), 6 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 5a2094d..3696ad2 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4264,6 +4264,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4274,9 +4275,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4285,6 +4293,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4325,6 +4334,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationObject(&(pubrinfo[j].dobj), fout);
@@ -4395,8 +4408,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index d1d8608..0842a3c 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -630,6 +630,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 2f412ca..c1591f4 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,14 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (");
+ /* "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with table attributes */
+ /* "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2777,11 +2785,14 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("TABLES", "TABLES IN SCHEMA");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
- else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
+ /* "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
/*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
--
1.8.3.1
v44-0003-Support-updates-based-on-old-and-new-tuple-in-ro.patch (application/octet-stream)
From f60dedf0613dcf8a14c05084862b4877c0b11fda Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 2 Dec 2021 16:30:37 +1100
Subject: [PATCH v44] Support updates based on old and new tuple in row
filters.
When applying row filter on updates, check both old_tuple and new_tuple
to decide how an update needs to be transformed.
UPDATE
old-row (match) new-row (no match) -> DELETE
old-row (no match) new row (match) -> INSERT
old-row (match) new row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Also tuples that have been deformed will be cached in slots to avoid
multiple deforming of tuples.
Author: Ajin Cherian
---
src/backend/replication/logical/proto.c | 35 +++--
src/backend/replication/pgoutput/pgoutput.c | 194 +++++++++++++++++++++++++---
src/include/replication/logicalproto.h | 7 +-
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/027_row_filter.pl | 4 +-
src/tools/pgindent/typedefs.list | 1 +
6 files changed, 209 insertions(+), 38 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..6b55a94 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -31,8 +32,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
- HeapTuple tuple, bool binary);
-
+ HeapTuple tuple, TupleTableSlot *slot,
+ bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -410,7 +411,7 @@ logicalrep_write_insert(StringInfo out, TransactionId xid, Relation rel,
pq_sendint32(out, RelationGetRelid(rel));
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, NULL, binary);
}
/*
@@ -442,7 +443,8 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
*/
void
logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
- HeapTuple oldtuple, HeapTuple newtuple, bool binary)
+ HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary)
{
pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
@@ -463,11 +465,11 @@ logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
pq_sendbyte(out, 'O'); /* old tuple follows */
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, oldslot, binary);
}
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, newslot, binary);
}
/*
@@ -536,7 +538,7 @@ logicalrep_write_delete(StringInfo out, TransactionId xid, Relation rel,
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, NULL, binary);
}
/*
@@ -749,11 +751,12 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
-logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binary)
+logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, TupleTableSlot *slot,
+bool binary)
{
TupleDesc desc;
- Datum values[MaxTupleAttributeNumber];
- bool isnull[MaxTupleAttributeNumber];
+ Datum *values;
+ bool *isnull;
int i;
uint16 nliveatts = 0;
@@ -771,7 +774,17 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binar
enlargeStringInfo(out, tuple->t_len +
nliveatts * (1 + 4));
- heap_deform_tuple(tuple, desc, values, isnull);
+ if (slot == NULL || TTS_EMPTY(slot))
+ {
+ values = (Datum *) palloc(desc->natts * sizeof(Datum));
+ isnull = (bool *) palloc(desc->natts * sizeof(bool));
+ heap_deform_tuple(tuple, desc, values, isnull);
+ }
+ else
+ {
+ values = slot->tts_values;
+ isnull = slot->tts_isnull;
+ }
/* Write the values */
for (i = 0; i < desc->natts; i++)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 3b85915..0ccffa7 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,6 +13,7 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
#include "catalog/pg_publication_rel.h"
@@ -25,6 +26,7 @@
#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
#include "utils/builtins.h"
@@ -132,7 +134,10 @@ typedef struct RelationSyncEntry
*/
bool rowfilter_valid;
ExprState *exprstate; /* ExprState for row filter(s) */
- TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
+ TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
+ TupleTableSlot *tmp_new_tuple; /* slot for temporary new tuple used for expression evaluation */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -167,10 +172,15 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
- HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, TupleTableSlot *slot,
+ RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry,
+ ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -734,18 +744,112 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * Update is checked against the row filter, if any.
+ * Updates are transformed to inserts and deletes based on the
+ * old_tuple and new_tuple. The new action is updated in the
+ * action parameter. If not updated, action remains as update.
+ * old-row (no match) new-row (no match) -> (drop change)
+ * old-row (no match) new row (match) -> INSERT
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (match) new row (match) -> UPDATE
+ * If the change is to be replicated returns true, else false.
*/
static bool
-pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
+{
+ TupleDesc desc = RelationGetDescr(relation);
+ int i;
+ bool old_matched, new_matched;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ /* update requires a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /*
+ * If no old_tuple, then none of the replica identity columns changed
+ * and this would reduce to a simple update.
+ */
+ if (!oldtuple)
+ {
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ return pgoutput_row_filter(relation, NULL, newtuple, NULL, entry);
+ }
+
+ old_slot = entry->old_tuple;
+ new_slot = entry->new_tuple;
+ tmp_new_slot = entry->tmp_new_tuple;
+ ExecClearTuple(old_slot);
+ ExecClearTuple(new_slot);
+ ExecClearTuple(tmp_new_slot);
+
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, new_slot);
+
+ /*
+ * For updates, both the newtuple and oldtuple needs to be checked
+ * against the row-filter. The newtuple might not have all the
+ * replica identity columns, in which case it needs to be
+ * copied over from the oldtuple.
+ */
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ /* if the column in the new_tuple is null, nothing to do */
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ /*
+ * Unchanged toasted replica identity columns are
+ * only detoasted in the old tuple, copy this over to the newtuple.
+ */
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ Assert(!old_slot->tts_isnull[i] &&
+ !VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]));
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ }
+
+ }
+
+ old_matched = pgoutput_row_filter(relation, NULL, NULL, old_slot, entry);
+ new_matched = pgoutput_row_filter(relation, NULL, NULL, tmp_new_slot, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (!old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && old_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter, the first time.
+ */
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
{
- EState *estate;
- ExprContext *ecxt;
ListCell *lc;
- bool result = true;
- Oid relid = RelationGetRelid(relation);
List *rfnodes = NIL;
int n_filters;
@@ -857,16 +961,34 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
/*
- * Create a tuple table slot for row filter. TupleDesc must live as
- * long as the cache remains.
+ * Create tuple table slots for row filter. Create a copy of the
+ * TupleDesc as it needs to live as long as the cache remains.
*/
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
MemoryContextSwitchTo(oldctx);
}
entry->rowfilter_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *slot,
+RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
/* Bail out if there is no row filter */
if (!entry->exprstate)
@@ -885,7 +1007,12 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
ecxt = GetPerTupleExprContext(estate);
ecxt->ecxt_scantuple = entry->scantuple;
- ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+ if (newtuple || oldtuple)
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+ else
+ {
+ ecxt->ecxt_scantuple = slot;
+ }
/*
* NOTE: Multiple publication row-filters have already been combined to a
@@ -898,7 +1025,6 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
}
/* Cleanup allocated resources */
- ResetExprContext(ecxt);
FreeExecutorState(estate);
PopActiveSnapshot();
@@ -956,6 +1082,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -964,7 +1093,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(relation, NULL, tuple, NULL, relentry))
break;
/*
@@ -995,9 +1124,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
- /* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter_update_check(relation, oldtuple, newtuple, relentry,
+ &modified_action))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1020,8 +1150,27 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ logicalrep_write_update(ctx->out, xid, relation,
+ oldtuple, newtuple, relentry->old_tuple,
+ relentry->new_tuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1031,7 +1180,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1449,6 +1598,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
+ entry->new_tuple = NULL;
+ entry->old_tuple = NULL;
+ entry->tmp_new_tuple = NULL;
entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..427c40a 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -211,7 +212,11 @@ extern void logicalrep_write_insert(StringInfo out, TransactionId xid,
extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup);
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
- HeapTuple newtuple, bool binary);
+ HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index de6b73d..a2f25f6 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -277,7 +277,8 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+#   but the row is deleted by the UPDATE below.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -289,7 +290,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 575969c..e8dc5ad 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2198,6 +2198,7 @@ ReorderBufferApplyChangeCB
ReorderBufferApplyTruncateCB
ReorderBufferBeginCB
ReorderBufferChange
+ReorderBufferChangeType
ReorderBufferCommitCB
ReorderBufferCommitPreparedCB
ReorderBufferDiskChange
--
1.8.3.1
v44-0002-PS-Row-filter-validation-walker.patch (application/octet-stream) Download
From beb8a9749a36f4192713acadfdc09cc2e73c94f4 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 2 Dec 2021 16:29:23 +1100
Subject: [PATCH v44] PS - Row filter validation walker
This patch implements a parse-tree "walker" to validate a row-filter expression.
REPLICA IDENTITY validation
---------------------------
For publish modes "delete" and "update" it validates that every column referenced
in the filter expression is part of the REPLICA IDENTITY or primary key.
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
Expression Node-kind validation
-------------------------------
Only simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no user-defined functions.
- no user-defined types.
- no system functions (unless they are IMMUTABLE). See design decision at [1].
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr, NullIfExpr, NullTest
[1] https://www.postgresql.org/message-id/CAA4eK1%2BXoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ%40mail.gmail.com
---
src/backend/catalog/pg_publication.c | 198 +++++++++++++++++++++++++++++-
src/backend/parser/parse_agg.c | 14 ++-
src/backend/parser/parse_expr.c | 22 ++--
src/backend/parser/parse_func.c | 6 +-
src/backend/parser/parse_oper.c | 7 --
src/test/regress/expected/publication.out | 144 +++++++++++++++++++---
src/test/regress/sql/publication.sql | 106 +++++++++++++++-
src/test/subscription/t/027_row_filter.pl | 7 +-
src/tools/pgindent/typedefs.list | 1 +
9 files changed, 448 insertions(+), 57 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 89d00cd..d67023a 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,9 +33,11 @@
#include "catalog/pg_publication_namespace.h"
#include "catalog/pg_publication_rel.h"
#include "catalog/pg_type.h"
+#include "catalog/pg_proc.h"
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parse_clause.h"
#include "parser/parse_collate.h"
#include "parser/parse_relation.h"
@@ -219,10 +221,199 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
+/* For rowfilter_walker. */
+typedef struct {
+ Relation rel;
+ bool check_replident; /* check if Var is bms_replident member? */
+ Bitmapset *bms_replident;
+} rf_context;
+
/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
+ * The row filter walker checks that the row filter expression is legal.
+ *
+ * Rules: Node-type validation
+ * ---------------------------
+ * Allow only simple or compound expressions such as:
+ * - "(Var Op Const)" or
+ * - "(Var Op Var)" or
+ * - "(Var Op Const) Bool (Var Op Const)"
+ * - etc
+ * (where Var is a column of the table this filter belongs to)
+ *
+ * Specifically,
+ * - User-defined operators are not allowed.
+ * - User-defined functions are not allowed.
+ * - User-defined types are not allowed.
+ * - System functions that are not IMMUTABLE are not allowed.
+ * - NULLIF is allowed.
+ * - IS NULL is allowed.
+ *
+ * Notes:
+ *
+ * We don't allow user-defined functions/operators/types because (a) if the user
+ * drops such a user-defined object, or if there is any other error in its
+ * function, the walsender won't be able to recover even after we fix the
+ * function's problem, because a historic snapshot is used to access the
+ * row-filter; (b) any other table could be accessed via a function, which won't
+ * work because of historic snapshots in the logical decoding environment.
+ *
+ * We don't allow anything other than immutable built-in functions because
+ * non-immutable functions can access the database, which would lead to
+ * problem (b) mentioned in the previous paragraph.
+ *
+ * Rules: Replica Identity validation
+ * -----------------------------------
+ * If the flag context.check_replident is true then validate that every variable
+ * referenced by the filter expression is a valid member of the allowed set of
+ * replica identity columns (context.bms_replident).
*/
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+ bool too_complex = false;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+
+ /* User-defined types not allowed. */
+ if (var->vartype >= FirstNormalObjectId)
+ forbidden = _("user-defined types are not allowed");
+
+ /* Optionally, do replica identity validation of the referenced column. */
+ if (context->check_replident)
+ {
+ Oid relid = RelationGetRelid(context->rel);
+ AttrNumber attnum = var->varattno;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, context->bms_replident))
+ {
+ const char *colname = get_attname(relid, attnum, false);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ RelationGetRelationName(context->rel)),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+ }
+ }
+ else if (IsA(node, Const) || IsA(node, BoolExpr) || IsA(node, NullIfExpr)
+ || IsA(node, NullTest))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ Oid funcid = ((FuncExpr *)node)->funcid;
+ char *funcname = get_func_name(funcid);
+
+ /*
+ * User-defined functions are not allowed.
+ * System-functions that are not IMMUTABLE are not allowed.
+ */
+ if (funcid >= FirstNormalObjectId)
+ {
+ forbidden = psprintf("user-defined functions are not allowed: %s",
+ funcname);
+ }
+ else
+ {
+ if (func_volatile(funcid) != PROVOLATILE_IMMUTABLE)
+ forbidden = psprintf("system functions that are not IMMUTABLE are not allowed: %s",
+ funcname);
+ }
+ }
+ else
+ {
+ elog(DEBUG1, "the row filter contained something unexpected: %s", nodeToString(node));
+ too_complex = true;
+ }
+
+ if (too_complex)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(context->rel)),
+ errhint("only simple expressions using columns, constants and immutable system functions are allowed")
+ ));
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(context->rel)),
+ errdetail("%s", forbidden)
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
+/*
+ * Check if the row-filter is valid according to the following rules:
+ *
+ * 1. Only certain simple node types are permitted in the expression. See
+ * function rowfilter_walker for details.
+ *
+ * 2. If the publish operation contains "delete" or "update" then only columns
+ * that are allowed by the REPLICA IDENTITY rules are permitted to be used in
+ * the row-filter WHERE clause.
+ */
+static void
+rowfilter_expr_checker(Publication *pub, Relation rel, Node *rfnode)
+{
+ rf_context context = {0};
+
+ context.rel = rel;
+
+ /*
+ * For "delete" or "update", check that filter cols are also valid replica
+ * identity cols.
+ */
+ if (pub->pubactions.pubdelete || pub->pubactions.pubupdate)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ context.check_replident = true;
+
+ /*
+ * Find what are the cols that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+ }
+ }
+
+ /*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if anything not permitted or unexpected is encountered.
+ */
+ rowfilter_walker(rfnode, &context);
+
+ bms_free(context.bms_replident);
+}
+
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -333,6 +524,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
whereclause = GetTransformedWhereClause(pstate, pri, true);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, targetrel, whereclause);
}
/* Form a tuple. */
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..f65a86f 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,10 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
- if (isAgg)
- err = _("aggregate functions are not allowed in publication WHERE expressions");
- else
- err = _("grouping operations are not allowed in publication WHERE expressions");
-
+ /*
+ * OK for now. The row-filter validation is done later by a walker
+ * function (see pg_publication).
+ */
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +950,10 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
- err = _("window functions are not allowed in publication WHERE expressions");
+ /*
+ * OK for now. The row-filter validation is done later by a walker
+ * function (see pg_publication).
+ */
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..d8627b9 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,19 +200,8 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- {
- /*
- * Forbid functions in publication WHERE condition
- */
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("functions are not allowed in publication WHERE expressions"),
- parser_errposition(pstate, exprLocation(expr))));
-
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
- }
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -1777,7 +1766,10 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
- err = _("cannot use subquery in publication WHERE expression");
+ /*
+ * OK for now. The row-filter validation is done later by a walker
+ * function (see pg_publication).
+ */
break;
/*
@@ -3100,7 +3092,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 29bebb7..4e4557f 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,11 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
- err = _("set-returning functions are not allowed in publication WHERE expressions");
+ /*
+ * OK for now. The row-filter validation is done later by a walker
+ * function (see pg_publication).
+ */
+ pstate->p_hasTargetSRFs = true;
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..bc34a23 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,13 +718,6 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
- /* Check it's not a custom operator for publication WHERE expressions */
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("user-defined operators are not allowed in publication WHERE expressions"),
- parser_errposition(pstate, location)));
-
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 6959675..d9ee9ff 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -248,13 +248,15 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -264,7 +266,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -275,7 +277,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -286,7 +288,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
@@ -310,26 +312,26 @@ Publications:
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax1
Publication testpub_syntax1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE ((e < 999))
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax2
Publication testpub_syntax2
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
@@ -353,19 +355,41 @@ ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: user-defined functions are not allowed: testpub_rf_func99
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: system functions that are not IMMUTABLE are not allowed: random
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
+-- ok - IS NULL is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+-- fail - user-defined types disallowed
+CREATE TYPE rf_bug_status AS ENUM ('new', 'open', 'closed');
+CREATE TABLE rf_bug (id serial, description text, status rf_bug_status);
+CREATE PUBLICATION testpub6 FOR TABLE rf_bug WHERE (status = 'open') WITH (publish="insert");
+ERROR: invalid publication WHERE expression for relation "rf_bug"
+DETAIL: user-defined types are not allowed
+DROP TABLE rf_bug;
+DROP TYPE rf_bug_status;
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+HINT: only simple expressions using columns, constants and immutable system functions are allowed
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
@@ -387,6 +411,92 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 40198fc..fcc09b1 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -143,7 +143,9 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish="insert" because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -163,12 +165,12 @@ RESET client_min_messages;
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax1
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = "insert");
RESET client_min_messages;
\dRp+ testpub_syntax2
DROP PUBLICATION testpub_syntax2;
@@ -182,13 +184,31 @@ SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
+-- ok - IS NULL is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - user-defined types disallowed
+CREATE TYPE rf_bug_status AS ENUM ('new', 'open', 'closed');
+CREATE TABLE rf_bug (id serial, description text, status rf_bug_status);
+CREATE PUBLICATION testpub6 FOR TABLE rf_bug WHERE (status = 'open') WITH (publish="insert");
+DROP TABLE rf_bug;
+DROP TYPE rf_bug_status;
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
SET client_min_messages = 'ERROR';
@@ -208,6 +228,82 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
+
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index 64e71d0..de6b73d 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -280,9 +282,7 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -291,7 +291,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index f41ef0d..575969c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3501,6 +3501,7 @@ replace_rte_variables_context
ret_type
rewind_source
rewrite_event
+rf_context
rijndael_ctx
rm_detail_t
role_auth_extra
--
1.8.3.1
v44-0005-cache-the-result-of-row-filter-column-validation.patchapplication/octet-stream; name=v44-0005-cache-the-result-of-row-filter-column-validation.patchDownload
From 2dae9360279b2b263fec9244acd9317163964f84 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 2 Dec 2021 16:36:00 +1100
Subject: [PATCH v44] cache the result of row filter column validation
For publish modes "delete" and "update", validate that any columns referenced
in the filter expression are part of the REPLICA IDENTITY or primary key.
Move the row filter column validation to CheckCmdReplicaIdentity, so that
the validation is executed only when an actual UPDATE or DELETE is executed on
the published relation. This is consistent with the existing replica identity
check.
Cache the result of the row filter column validation in the relcache to
reduce the cost of the validation. This is safe because every operation that
changes the row filter or replica identity will invalidate the relcache.
---
src/backend/catalog/pg_publication.c | 104 ++----------------
src/backend/executor/execReplication.c | 35 +++++-
src/backend/utils/cache/relcache.c | 173 +++++++++++++++++++++++++-----
src/include/utils/rel.h | 6 ++
src/include/utils/relcache.h | 1 +
src/test/regress/expected/publication.out | 56 ++++++----
src/test/regress/sql/publication.sql | 40 ++++---
7 files changed, 263 insertions(+), 152 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index d67023a..b9619ef 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -221,13 +221,6 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
-/* For rowfilter_walker. */
-typedef struct {
- Relation rel;
- bool check_replident; /* check if Var is bms_replident member? */
- Bitmapset *bms_replident;
-} rf_context;
-
/*
* The row filter walker checks that the row filter expression is legal.
*
@@ -260,15 +253,9 @@ typedef struct {
* We don't allow anything other than immutable built-in functions because those
* (not immutable ones) can access database and would lead to the problem (b)
* mentioned in the previous paragraph.
- *
- * Rules: Replica Identity validation
- * -----------------------------------
- * If the flag context.check_replident is true then validate that every variable
- * referenced by the filter expression is a valid member of the allowed set of
- * replica identity columns (context.bms_replindent)
*/
static bool
-rowfilter_walker(Node *node, rf_context *context)
+rowfilter_walker(Node *node, Relation relation)
{
char *forbidden = NULL;
bool too_complex = false;
@@ -283,25 +270,6 @@ rowfilter_walker(Node *node, rf_context *context)
/* User-defined types not allowed. */
if (var->vartype >= FirstNormalObjectId)
forbidden = _("user-defined types are not allowed");
-
- /* Optionally, do replica identify validation of the referenced column. */
- if (context->check_replident)
- {
- Oid relid = RelationGetRelid(context->rel);
- AttrNumber attnum = var->varattno;
-
- if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, context->bms_replident))
- {
- const char *colname = get_attname(relid, attnum, false);
-
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
- errmsg("cannot add relation \"%s\" to publication",
- RelationGetRelationName(context->rel)),
- errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
- colname)));
- }
- }
}
else if (IsA(node, Const) || IsA(node, BoolExpr) || IsA(node, NullIfExpr)
|| IsA(node, NullTest))
@@ -344,74 +312,18 @@ rowfilter_walker(Node *node, rf_context *context)
if (too_complex)
ereport(ERROR,
(errmsg("invalid publication WHERE expression for relation \"%s\"",
- RelationGetRelationName(context->rel)),
+ RelationGetRelationName(relation)),
errhint("only simple expressions using columns, constants and immutable system functions are allowed")
));
if (forbidden)
ereport(ERROR,
(errmsg("invalid publication WHERE expression for relation \"%s\"",
- RelationGetRelationName(context->rel)),
+ RelationGetRelationName(relation)),
errdetail("%s", forbidden)
));
- return expression_tree_walker(node, rowfilter_walker, (void *)context);
-}
-
-/*
- * Check if the row-filter is valid according to the following rules:
- *
- * 1. Only certain simple node types are permitted in the expression. See
- * function rowfilter_walker for details.
- *
- * 2. If the publish operation contains "delete" or "update" then only columns
- * that are allowed by the REPLICA IDENTITY rules are permitted to be used in
- * the row-filter WHERE clause.
- */
-static void
-rowfilter_expr_checker(Publication *pub, Relation rel, Node *rfnode)
-{
- rf_context context = {0};
-
- context.rel = rel;
-
- /*
- * For "delete" or "update", check that filter cols are also valid replica
- * identity cols.
- */
- if (pub->pubactions.pubdelete || pub->pubactions.pubupdate)
- {
- char replica_identity = rel->rd_rel->relreplident;
-
- if (replica_identity == REPLICA_IDENTITY_FULL)
- {
- /*
- * FULL means all cols are in the REPLICA IDENTITY, so all cols are
- * allowed in the row-filter too.
- */
- }
- else
- {
- context.check_replident = true;
-
- /*
- * Find what are the cols that are part of the REPLICA IDENTITY.
- * Note that REPLICA IDENTIY DEFAULT means primary key or nothing.
- */
- if (replica_identity == REPLICA_IDENTITY_DEFAULT)
- context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
- else
- context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
- }
- }
-
- /*
- * Walk the parse-tree of this publication row filter expression and throw an
- * error if anything not permitted or unexpected is encountered.
- */
- rowfilter_walker(rfnode, &context);
-
- bms_free(context.bms_replident);
+ return expression_tree_walker(node, rowfilter_walker, (void *) relation);
}
List *
@@ -525,8 +437,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
whereclause = GetTransformedWhereClause(pstate, pri, true);
- /* Validate the row-filter. */
- rowfilter_expr_checker(pub, targetrel, whereclause);
+ /*
+ * Walk the parse-tree of this publication row filter expression and
+ * throw an error if anything not permitted or unexpected is
+ * encountered.
+ */
+ rowfilter_walker(whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 574d7d2..0dcc9df 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -568,14 +568,45 @@ void
CheckCmdReplicaIdentity(Relation rel, CmdType cmd)
{
PublicationActions *pubactions;
+ AttrNumber invalid_rfcol;
/* We only need to do checks for UPDATE and DELETE. */
if (cmd != CMD_UPDATE && cmd != CMD_DELETE)
return;
+ if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ return;
+
+ invalid_rfcol = RelationGetInvalRowFilterCol(rel);
+
+ /*
+ * It is only safe to execute UPDATE/DELETE when all columns of the row
+ * filters from publications which the relation is in are part of the
+ * REPLICA IDENTITY.
+ */
+ if (invalid_rfcol != InvalidAttrNumber)
+ {
+ const char *colname = get_attname(RelationGetRelid(rel),
+ invalid_rfcol, false);
+
+ if (cmd == CMD_UPDATE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot update table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ else if (cmd == CMD_DELETE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot delete from table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+
/* If relation has replica identity we are always good. */
- if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
- OidIsValid(RelationGetReplicaIndex(rel)))
+ if (OidIsValid(RelationGetReplicaIndex(rel)))
return;
/*
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index e1ea079..3d4efa8 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -56,6 +56,7 @@
#include "catalog/pg_opclass.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "catalog/pg_rewrite.h"
#include "catalog/pg_shseclabel.h"
#include "catalog/pg_statistic_ext.h"
@@ -5548,28 +5549,69 @@ RelationGetExclusionInfo(Relation indexRelation,
MemoryContextSwitchTo(oldcxt);
}
+/* For invalid_rowfilter_column_walker. */
+typedef struct {
+ AttrNumber invalid_rfcol;
+ Bitmapset *bms_replident;
+} rf_context;
+
/*
- * Get publication actions for the given relation.
+ * Check if any columns used in the row-filter WHERE clause are not part of
+ * REPLICA IDENTITY and save the invalid column number in
+ * rf_context::invalid_rfcol.
*/
-struct PublicationActions *
-GetRelationPublicationActions(Relation relation)
+static bool
+invalid_rowfilter_column_walker(Node *node, rf_context *context)
{
- List *puboids;
- ListCell *lc;
- MemoryContext oldcxt;
- Oid schemaid;
- PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+ AttrNumber attnum = var->varattno;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber,
+ context->bms_replident))
+ {
+ context->invalid_rfcol = attnum;
+ return true;
+ }
+ }
+
+ return expression_tree_walker(node, invalid_rowfilter_column_walker,
+ (void *) context);
+}
+
+/*
+ * Get the invalid row filter column number for the given relation.
+ *
+ * Traverse all the publications which the relation is in to get the
+ * publication actions. If the publication actions include UPDATE or DELETE,
+ * then validate that all columns referenced in the row filter expression
+ * are part of the REPLICA IDENTITY.
+ *
+ * If any row filter column is not part of the REPLICA IDENTITY, return its
+ * column number; otherwise return InvalidAttrNumber.
+ */
+AttrNumber
+RelationGetInvalRowFilterCol(Relation relation)
+{
+ List *puboids;
+ ListCell *lc;
+ MemoryContext oldcxt;
+ Oid schemaid;
+ rf_context context = { 0 };
+ PublicationActions pubactions = { 0 };
+ bool rfcol_valid = true;
+ AttrNumber invalid_rfcol = InvalidAttrNumber;
/*
* If not publishable, it publishes no actions. (pgoutput_change() will
* ignore it.)
*/
- if (!is_publishable_relation(relation))
- return pubactions;
-
- if (relation->rd_pubactions)
- return memcpy(pubactions, relation->rd_pubactions,
- sizeof(PublicationActions));
+ if (!is_publishable_relation(relation) || relation->rd_rfcol_valid)
+ return invalid_rfcol;
/* Fetch the publication membership info. */
puboids = GetRelationPublications(RelationGetRelid(relation));
@@ -5595,10 +5637,22 @@ GetRelationPublicationActions(Relation relation)
}
puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
+ /*
+ * Find which columns are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (relation->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT)
+ context.bms_replident = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else if (relation->rd_rel->relreplident == REPLICA_IDENTITY_INDEX)
+ context.bms_replident = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_IDENTITY_KEY);
+
foreach(lc, puboids)
{
Oid pubid = lfirst_oid(lc);
HeapTuple tup;
+
Form_pg_publication pubform;
tup = SearchSysCache1(PUBLICATIONOID, ObjectIdGetDatum(pubid));
@@ -5608,35 +5662,105 @@ GetRelationPublicationActions(Relation relation)
pubform = (Form_pg_publication) GETSTRUCT(tup);
- pubactions->pubinsert |= pubform->pubinsert;
- pubactions->pubupdate |= pubform->pubupdate;
- pubactions->pubdelete |= pubform->pubdelete;
- pubactions->pubtruncate |= pubform->pubtruncate;
+ pubactions.pubinsert |= pubform->pubinsert;
+ pubactions.pubupdate |= pubform->pubupdate;
+ pubactions.pubdelete |= pubform->pubdelete;
+ pubactions.pubtruncate |= pubform->pubtruncate;
ReleaseSysCache(tup);
/*
- * If we know everything is replicated, there is no point to check for
- * other publications.
+ * If the publication actions include UPDATE or DELETE, validate
+ * that any columns referenced in the filter expression are part of
+ * the REPLICA IDENTITY index.
+ *
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter and we can skip the validation.
+ *
+ * If we already found a row filter column that is not part of the
+ * REPLICA IDENTITY index, skip the validation too.
*/
- if (pubactions->pubinsert && pubactions->pubupdate &&
- pubactions->pubdelete && pubactions->pubtruncate)
+ if ((pubform->pubupdate || pubform->pubdelete) &&
+ relation->rd_rel->relreplident != REPLICA_IDENTITY_FULL &&
+ rfcol_valid)
+ {
+ HeapTuple rftuple;
+
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP,
+ ObjectIdGetDatum(RelationGetRelid(relation)),
+ ObjectIdGetDatum(pubid));
+
+ if (HeapTupleIsValid(rftuple))
+ {
+ Datum rfdatum;
+ bool rfisnull;
+ Node *rfnode;
+
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
+ Anum_pg_publication_rel_prqual,
+ &rfisnull);
+
+ if (!rfisnull)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfcol_valid = !invalid_rowfilter_column_walker(rfnode,
+ &context);
+ invalid_rfcol = context.invalid_rfcol;
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+ }
+
+ /*
+ * If we know everything is replicated and some columns are not part of
+ * the replica identity, there is no point in checking other publications.
+ */
+ if (pubactions.pubinsert && pubactions.pubupdate &&
+ pubactions.pubdelete && pubactions.pubtruncate &&
+ !rfcol_valid)
break;
}
+ bms_free(context.bms_replident);
+
if (relation->rd_pubactions)
{
pfree(relation->rd_pubactions);
relation->rd_pubactions = NULL;
}
+ relation->rd_rfcol_valid = rfcol_valid;
+
/* Now save copy of the actions in the relcache entry. */
oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
relation->rd_pubactions = palloc(sizeof(PublicationActions));
- memcpy(relation->rd_pubactions, pubactions, sizeof(PublicationActions));
+ memcpy(relation->rd_pubactions, &pubactions, sizeof(PublicationActions));
MemoryContextSwitchTo(oldcxt);
- return pubactions;
+ return invalid_rfcol;
+}
+
+/*
+ * Get publication actions for the given relation.
+ */
+struct PublicationActions *
+GetRelationPublicationActions(Relation relation)
+{
+ PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
+
+ /*
+ * If not publishable, it publishes no actions. (pgoutput_change() will
+ * ignore it.)
+ */
+ if (!is_publishable_relation(relation))
+ return pubactions;
+
+ if (!relation->rd_pubactions)
+ (void) RelationGetInvalRowFilterCol(relation);
+
+ return memcpy(pubactions, relation->rd_pubactions,
+ sizeof(PublicationActions));
}
/*
@@ -6193,6 +6317,7 @@ load_relcache_init_file(bool shared)
rel->rd_idattr = NULL;
rel->rd_hotblockingattr = NULL;
rel->rd_pubactions = NULL;
+ rel->rd_rfcol_valid = false;
rel->rd_statvalid = false;
rel->rd_statlist = NIL;
rel->rd_fkeyvalid = false;
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 3128127..84c58f9 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -164,6 +164,12 @@ typedef struct RelationData
PublicationActions *rd_pubactions; /* publication actions */
/*
+ * True if the columns of the row filters from all the publications that
+ * the relation is in are part of the replica identity.
+ */
+ bool rd_rfcol_valid;
+
+ /*
* rd_options is set whenever rd_rel is loaded into the relcache entry.
* Note that you can NOT look into rd_rel for this data. NULL means "use
* defaults".
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 82316bb..1f091af 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -76,6 +76,7 @@ extern void RelationInitIndexAccessInfo(Relation relation);
/* caller must include pg_publication.h */
struct PublicationActions;
extern struct PublicationActions *GetRelationPublicationActions(Relation relation);
+extern AttrNumber RelationGetInvalRowFilterCol(Relation relation);
extern void RelationInitTableAccessMethod(Relation relation);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index d9ee9ff..5affa97 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -426,21 +426,27 @@ DROP PUBLICATION testpub6;
-- ok - "b" is a PK col
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
-RESET client_min_messages;
DROP PUBLICATION testpub6;
--- fail - "c" is not part of the PK
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+-- fail - "c" is not part of the PK
+UPDATE rf_tbl_abcd_pk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
--- fail - "d" is not part of the PK
+DROP PUBLICATION testpub6;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+-- fail - "d" is not part of the PK
+UPDATE rf_tbl_abcd_pk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
--- fail - "a" is not part of REPLICA IDENTITY
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+-- fail - "a" is not part of REPLICA IDENTITY
+UPDATE rf_tbl_abcd_nopk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
-- Case 2. REPLICA IDENTITY FULL
ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
@@ -454,21 +460,29 @@ SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
DROP PUBLICATION testpub6;
RESET client_min_messages;
+SET client_min_messages = 'ERROR';
-- Case 3. REPLICA IDENTITY NOTHING
ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
--- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
--- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+DROP PUBLICATION testpub6;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
--- fail - "a" is not in REPLICA IDENTITY NOTHING
+DROP PUBLICATION testpub6;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_nopk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
-- Case 4. REPLICA IDENTITY INDEX
ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
@@ -476,21 +490,23 @@ ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
--- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+update rf_tbl_abcd_pk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
-SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
DROP PUBLICATION testpub6;
-RESET client_min_messages;
--- fail - "a" is not in REPLICA IDENTITY INDEX
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+update rf_tbl_abcd_nopk set a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+DROP PUBLICATION testpub6;
-- ok - "c" is part of REPLICA IDENTITY INDEX
-SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
DROP PUBLICATION testpub6;
RESET client_min_messages;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index fcc09b1..559bc26 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -245,15 +245,21 @@ DROP PUBLICATION testpub6;
-- ok - "b" is a PK col
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
-RESET client_min_messages;
DROP PUBLICATION testpub6;
--- fail - "c" is not part of the PK
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
--- fail - "d" is not part of the PK
+-- fail - "c" is not part of the PK
+UPDATE rf_tbl_abcd_pk set a = 1;
+DROP PUBLICATION testpub6;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- fail - "d" is not part of the PK
+UPDATE rf_tbl_abcd_pk set a = 1;
+DROP PUBLICATION testpub6;
-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
--- fail - "a" is not part of REPLICA IDENTITY
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not part of REPLICA IDENTITY
+UPDATE rf_tbl_abcd_nopk set a = 1;
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
-- Case 2. REPLICA IDENTITY FULL
ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
@@ -269,15 +275,23 @@ CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
DROP PUBLICATION testpub6;
RESET client_min_messages;
+SET client_min_messages = 'ERROR';
-- Case 3. REPLICA IDENTITY NOTHING
ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
--- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
--- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk set a = 1;
+DROP PUBLICATION testpub6;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
--- fail - "a" is not in REPLICA IDENTITY NOTHING
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk set a = 1;
+DROP PUBLICATION testpub6;
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_nopk set a = 1;
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
-- Case 4. REPLICA IDENTITY INDEX
ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
@@ -286,17 +300,19 @@ ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
--- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+update rf_tbl_abcd_pk set a = 1;
+DROP PUBLICATION testpub6;
-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
-SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
DROP PUBLICATION testpub6;
-RESET client_min_messages;
--- fail - "a" is not in REPLICA IDENTITY INDEX
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+update rf_tbl_abcd_nopk set a = 1;
+DROP PUBLICATION testpub6;
-- ok - "c" is part of REPLICA IDENTITY INDEX
-SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
DROP PUBLICATION testpub6;
RESET client_min_messages;
--
1.8.3.1
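The user-visible effect of this 0005 patch is that the replica identity check
for row filter columns moves from DDL time to DML time. A minimal sketch of the
new behavior (table and publication names are illustrative; the error text
follows the regression output above):

```sql
CREATE TABLE t (a int PRIMARY KEY, b int);
-- Succeeds now: "b" is not part of the replica identity, but the
-- filter is no longer validated when the publication is created.
CREATE PUBLICATION p FOR TABLE t WHERE (b > 0);
-- Fails instead when an UPDATE or DELETE is attempted:
UPDATE t SET b = 1;
-- ERROR:  cannot update table "t"
-- DETAIL:  Row filter column "b" is not part of the REPLICA IDENTITY
```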
v44-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v44-0001-Row-filter-for-logical-replication.patchDownload
From 9525ffc30cc54e6ea191ecabf9f7772ed1d44608 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 2 Dec 2021 16:20:51 +1100
Subject: [PATCH v44] Row filter for logical replication.
This feature adds row filtering for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The row-filter WHERE clause for a table added to a publication that publishes
UPDATE and/or DELETE operations must contain only columns that are covered by
REPLICA IDENTITY. The row-filter WHERE clause for a table added to a publication
that publishes INSERT can use any column. The WHERE clause does not allow
user-defined functions / operators / types; it also does not allow built-in
functions unless they are immutable.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is pulled by the subscriber. If the subscription has several
publications in which a table has been published with different WHERE clauses,
rows which satisfy ANY of the expressions will be copied. If the subscriber is
a pre-15 version, data synchronization won't use row filters even if they are
defined on the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
Psql commands \dRp+ and \d+ will display any row filters.
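As a quick illustration of the syntax described above (table and publication
names here are hypothetical, for illustration only):

```sql
-- Hypothetical table, for illustration only
CREATE TABLE departments (id int PRIMARY KEY, active boolean);

-- Publish only a slice of the table; parentheses around the
-- expression are required
CREATE PUBLICATION active_depts FOR TABLE departments
    WHERE (active IS TRUE AND id <= 3000);

-- In psql, \dRp+ active_depts displays the row filter
```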
Author: Euler Taveira, Peter Smith
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
Combining multiple row-filters
==============================
The subscription is treated "as a union of all the publications" [1], so the
row-filters are combined with OR.
If the subscription has several publications in which the same table has been
published with different filters, those expressions get OR'ed together so that
rows satisfying any of the expressions will be replicated.
Notice this means that if one of the publications has no filter at all, then
all other filters become redundant.
Author: Peter Smith
[1] https://www.postgresql.org/message-id/574b4e78-2f35-acf3-4bdc-4b872582e739%40enterprisedb.com
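To sketch the union behavior (table, publication, and connection values below
are hypothetical placeholders):

```sql
CREATE PUBLICATION p1 FOR TABLE t1 WHERE (a > 100);
CREATE PUBLICATION p2 FOR TABLE t1 WHERE (b < 50);

-- A subscription to both publications replicates the rows of t1
-- satisfying (a > 100) OR (b < 50)
CREATE SUBSCRIPTION s1 CONNECTION '...' PUBLICATION p1, p2;

-- If a third subscribed publication published t1 with no WHERE
-- clause, every row would qualify and the other filters would
-- become redundant.
```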
Row-filter caching
==================
The cached row-filters (e.g. ExprState *) are invalidated only in the function
rel_sync_cache_relation_cb, so an ALTER PUBLICATION for one table will not
invalidate the row-filters of other tables.
Row-filters are cached lazily, just before they are needed
(in the pgoutput_row_filter function).
If there are multiple publication filters for a given table these are all
combined/flattened into a single filter.
Author: Peter Smith, Greg Nancarrow
The filter caching is based on suggestions from Amit [1] [2] and Houz [3]
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
[3] https://www.postgresql.org/message-id/OS0PR01MB5716090A70A73ADF58C58950948D9%40OS0PR01MB5716.jpnprd01.prod.outlook.com
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 13 +-
doc/src/sgml/ref/create_publication.sgml | 28 ++-
doc/src/sgml/ref/create_subscription.sgml | 17 ++
src/backend/catalog/pg_publication.c | 62 ++++-
src/backend/commands/publicationcmds.c | 105 ++++++--
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 38 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 116 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 344 ++++++++++++++++++++++++++-
src/bin/psql/describe.c | 27 ++-
src/include/catalog/pg_publication.h | 7 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 148 ++++++++++++
src/test/regress/sql/publication.sql | 75 ++++++
src/test/subscription/t/027_row_filter.pl | 357 ++++++++++++++++++++++++++++
23 files changed, 1347 insertions(+), 49 deletions(-)
mode change 100644 => 100755 src/backend/parser/gram.y
create mode 100644 src/test/subscription/t/027_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1d11be..af6b1f6 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6299,6 +6299,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index bb4ef5e..01247d7 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -30,7 +30,7 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -52,7 +52,9 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
remove one or more tables/schemas from the publication. Note that adding
tables/schemas to a publication that is already subscribed to will require a
<literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on the
- subscribing side in order to become effective.
+ subscribing side in order to become effective. Note also that a row-filter
+ <literal>WHERE</literal> clause cannot be specified in a <literal>DROP</literal>
+ clause.
</para>
<para>
@@ -109,7 +111,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index d805e8e..d950316 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -28,7 +28,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -76,6 +76,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -226,6 +230,16 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The row-filter <literal>WHERE</literal> clause for a table added to a publication that
+ publishes <command>UPDATE</command> and/or <command>DELETE</command> operations must
+ contain only columns that are covered by <literal>REPLICA IDENTITY</literal>. The
+ row-filter <literal>WHERE</literal> clause for a table added to a publication that
+ publishes <command>INSERT</command> can use any column. The <literal>WHERE</literal>
+ clause does not allow user-defined functions / operators / types; it also does not allow
+ built-in functions unless they are immutable.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -247,6 +261,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -260,6 +279,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..8453467 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -208,6 +208,10 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
that are being subscribed to when the replication starts.
The default is <literal>true</literal>.
</para>
+ <para>
+ Row-filtering may also apply here and will affect what data is
+ copied. Refer to the Notes section below.
+ </para>
</listitem>
</varlistentry>
@@ -319,6 +323,19 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
the parameter <literal>create_slot = false</literal>. This is an
implementation restriction that might be lifted in a future release.
</para>
+
+ <para>
+ If any table in the publication has a <literal>WHERE</literal> clause, rows
+ that do not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be replicated. If the subscription has several publications in
+ which the same table has been published with different filters, those
+ expressions get OR'ed together so that rows satisfying any of the expressions
+ will be replicated. Notice this means that if one of the publications has no
+ filter at all, then all other filters become redundant. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row filtering
+ is ignored during data synchronization.
+ </para>
+
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 63579b2..89d00cd 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -253,22 +256,51 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
return result;
}
+Node *
+GetTransformedWhereClause(ParseState *pstate, PublicationRelInfo *pri,
+ bool bfixupcollation)
+{
+ ParseNamespaceItem *nsitem;
+ Node *transformedwhereclause = NULL;
+
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, pri->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ transformedwhereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION WHERE");
+
+ /* Fix up collation information */
+ if (bfixupcollation)
+ assign_expr_collations(pstate, transformedwhereclause);
+
+ return transformedwhereclause;
+}
+
/*
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -289,10 +321,19 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a ParseState to parse with */
+ pstate = make_parsestate(NULL);
+
+ /* Fix up collation information */
+ whereclause = GetTransformedWhereClause(pstate, pri, true);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -306,6 +347,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -322,6 +369,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 7d4a0e9..6373fa2 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -497,6 +497,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *rels = NIL;
Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
Oid pubid = pubform->oid;
+ Node *oldrelwhereclause = NULL;
/*
* It is quite possible that for the SET case user has not specified any
@@ -529,40 +530,92 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *delrels = NIL;
ListCell *oldlc;
+ /*
+ * Check if the relation is member of the existing schema in the
+ * publication or member of the schema list specified.
+ */
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
- /* Calculate which relations to drop. */
+ /*
+ * In order to recreate the relation list for the publication,
+ * look for existing relations that need not be dropped.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
+ ListCell *newlc;
+ PublicationRelInfo *oldrel;
bool found = false;
+ HeapTuple rftuple;
+ bool rfisnull = true;
+
+ /* look up the cache for the old relmap */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(oldrelid),
+ ObjectIdGetDatum(pubid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ Datum whereClauseDatum;
+
+ whereClauseDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual,
+ &rfisnull);
+ if (!rfisnull)
+ oldrelwhereclause = stringToNode(TextDatumGetCString(whereClauseDatum));
+
+ ReleaseSysCache(rftuple);
+ }
foreach(newlc, rels)
{
PublicationRelInfo *newpubrel;
newpubrel = (PublicationRelInfo *) lfirst(newlc);
+
+ /*
+ * Look if any of the new set of relations match with
+ * the existing relations in the publication. Additionally,
+ * if the relation has an associated where-clause, check the
+ * where-clauses also match. Drop the rest.
+ */
if (RelationGetRelid(newpubrel->relation) == oldrelid)
{
- found = true;
- break;
+ if (rfisnull && !newpubrel->whereClause)
+ {
+ found = true;
+ break;
+ }
+
+ if (!rfisnull && newpubrel->whereClause)
+ {
+ ParseState *pstate = make_parsestate(NULL);
+ Node *whereclause;
+
+ whereclause = GetTransformedWhereClause(pstate,
+ newpubrel,
+ false);
+ if (equal(oldrelwhereclause, whereclause))
+ {
+ free_parsestate(pstate);
+ found = true;
+ break;
+ }
+
+ free_parsestate(pstate);
+ }
}
}
- /* Not yet in the list, open it and add to the list */
+
+ /*
+ * Add the non-matched relations to a list so that they can
+ * be dropped.
+ */
if (!found)
{
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrelid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
}
@@ -899,6 +952,7 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ List *relids_with_rf = NIL;
/*
* Open, share-lock, and check all the explicitly-specified relations
@@ -926,15 +980,26 @@ OpenTableList(List *tables)
*/
if (list_member_oid(relids, myrelid))
{
+ /* Disallow duplicate tables if there are any with row-filters. */
+ if (t->whereClause || list_member_oid(relids_with_rf, myrelid))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("conflicting or redundant row-filters for \"%s\"",
+ RelationGetRelationName(rel))));
+
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
+ if (t->whereClause)
+ relids_with_rf = lappend_oid(relids_with_rf, myrelid);
+
/*
* Add children of this rel, if requested, so that they too are added
* to the publication. A partitioned table can't have any inheritance
@@ -967,6 +1032,8 @@ OpenTableList(List *tables)
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -974,6 +1041,7 @@ OpenTableList(List *tables)
}
list_free(relids);
+ list_free(relids_with_rf);
return rels;
}
@@ -993,6 +1061,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -1088,6 +1158,11 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
RelationGetRelationName(rel))));
}
+ if (pubrel->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE")));
+
ObjectAddressSet(obj, PublicationRelRelationId, prid);
performDeletion(&obj, DROP_CASCADE, 0);
}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 297b6ee..be9c1fb 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4832,6 +4832,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index f537d3e..5776447 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2312,6 +2312,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
old mode 100644
new mode 100755
index 86ce33b..8e96d54
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9654,12 +9654,13 @@ CreatePublicationStmt:
* relation_expr here.
*/
PublicationObjSpec:
- TABLE relation_expr
+ TABLE relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_TABLE;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $2;
+ $$->pubtable->whereClause = $3;
}
| ALL TABLES IN_P SCHEMA ColId
{
@@ -9674,28 +9675,45 @@ PublicationObjSpec:
$$->pubobjtype = PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA;
$$->location = @5;
}
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ /*
+ * The OptWhereClause (row-filter) must be stored here
+ * but it is valid only for tables. If the ColId was
+ * mistakenly not a table this will be detected later
+ * in preprocess_pubobj_list() and an error is thrown.
+ */
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
$$->location = @1;
}
- | ColId indirection
+ | ColId indirection OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
+ $$->pubtable->whereClause = $3;
$$->location = @1;
}
/* grammar like tablename * , ONLY tablename, ONLY ( tablename ) */
- | extended_relation_expr
+ | extended_relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $1;
+ $$->pubtable->whereClause = $2;
}
| CURRENT_SCHEMA
{
@@ -17343,7 +17361,8 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ if (pubobj->name)
{
/* convert it to PublicationTable */
PublicationTable *pubtable = makeNode(PublicationTable);
@@ -17356,6 +17375,13 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
else if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_SCHEMA ||
pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA)
{
+ /* Row filters are not allowed on schema objects. */
+ if (pubobj->pubtable && pubobj->pubtable->whereClause)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid to use WHERE (row-filter) for a schema"),
+ parser_errposition(pubobj->location));
+
/*
* We can distinguish between the different type of schema
* objects based on whether name and pubtable is set.
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 542f916..29bebb7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..af73b14 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,80 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get relation qual. DISTINCT avoids the same expression of a table in
+ * multiple publications from being included multiple times in the final
+ * expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ /*
+ * Multiple row-filter expressions for the same table will later be
+ * combined by the COPY using OR, but this means if any of the filters is
+ * null, then effectively none of the other filters is meaningful. So this
+ * loop is also checking for null filters and can exit early if any are
+ * encountered.
+ */
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+
+ if (isnull)
+ {
+ /*
+ * A single null filter nullifies the effect of any other filter for this
+ * table.
+ */
+ if (*qual)
+ {
+ list_free(*qual);
+ *qual = NIL;
+ }
+ break;
+ }
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +887,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +896,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +907,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with OR'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +927,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of OR'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
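To make the tablesync behaviour above concrete: copy_table() OR's the remote filters into the WHERE clause of the generated COPY (SELECT ...) query. A minimal standalone sketch of that string-combining step (plain C with a fixed buffer instead of StringInfo; append_or_filters is a hypothetical name, not part of the patch):

```c
#include <string.h>

/*
 * Append " WHERE f1 OR f2 OR ..." to buf, mirroring how copy_table()
 * builds "COPY (SELECT ... WHERE ...) TO STDOUT". Does nothing when
 * there are no filters, matching the patch's qual == NIL case.
 */
static void
append_or_filters(char *buf, size_t bufsz, const char **filters, int nfilters)
{
	if (nfilters <= 0)
		return;
	strncat(buf, " WHERE ", bufsz - strlen(buf) - 1);
	for (int i = 0; i < nfilters; i++)
	{
		if (i > 0)
			strncat(buf, " OR ", bufsz - strlen(buf) - 1);
		strncat(buf, filters[i], bufsz - strlen(buf) - 1);
	}
}
```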
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6f6a203..3b85915 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -15,16 +15,24 @@
#include "access/tupconvert.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/makefuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -116,6 +124,17 @@ typedef struct RelationSyncEntry
PublicationActions pubactions;
/*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' indicates whether the exprstate has been
+ * assigned yet. We cannot use the exprstate value itself for this purpose
+ * because there might be no filter at all for the current relid (i.e.
+ * exprstate is NULL).
+ */
+ bool rowfilter_valid;
+ ExprState *exprstate; /* ExprState for row filter(s) */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+
+ /*
* OID of the relation to publish changes as. For a partition, this may
* be set to one of its ancestors whose schema will be used when
* replicating changes, if publish_via_partition_root is set for the
@@ -137,7 +156,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +165,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +647,265 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+
+ /* Cache ExprState using CacheMemoryContext. */
+ Assert(CurrentMemoryContext == CacheMemoryContext);
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be cast to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * This is the same code as ExecPrepareExpr() but that is not used because
+ * we have no EState to pass it. There should probably be another function
+ * in the executor to handle the execution outside a normal Plan tree
+ * context.
+ */
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false, i.e. the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ List *rfnodes = NIL;
+ int n_filters;
+
+ /*
+ * If the row filter cache is currently flagged "invalid", we don't yet
+ * know whether there are any row filters for this relation.
+ *
+ * This code usually executes only once per relation.
+ *
+ * NOTE: The ExprState cache could have been created up-front in
+ * get_rel_sync_entry() instead of being deferred to this on-the-fly
+ * assignment. We do it here because there are scenarios where
+ * get_rel_sync_entry() is called but no row will be published. For
+ * example, truncate may not need any row evaluation, so there is no need
+ * to compute it, and it would be wasted work if an error occurred before
+ * the filter was actually evaluated. Other operations that use
+ * get_rel_sync_entry() but don't need an ExprState may be added later.
+ * Furthermore, because the decision to publish or not is made AFTER the
+ * call to get_rel_sync_entry(), filter evaluation may turn out to be
+ * unnecessary at all. So we defer this logic to the last moment, when we
+ * know it will be needed.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ MemoryContext oldctx;
+
+ /* Release the tuple table slot if it already exists. */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState and cache it in entry->exprstate.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ *
+ * NOTE: Multiple row-filters for the same table are combined by OR-ing
+ * them together. This means that if any of the publications has *no*
+ * filter, the other filters have no effect either.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+
+ /*
+ * Check if there is a row filter and, if so, remember it in a list. The
+ * code following this 'publications' loop combines all the filters.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(entry->publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes = lappend(rfnodes, rfnode);
+ MemoryContextSwitchTo(oldctx);
+
+ ReleaseSysCache(rftuple);
+ }
+ else
+ {
+ /*
+ * If there is no row-filter, then any other row-filters for this table
+ * also have no effect (because filters get OR-ed together) so we can
+ * just discard anything found so far and exit early from the publications
+ * loop.
+ */
+ if (rfnodes)
+ {
+ list_free_deep(rfnodes);
+ rfnodes = NIL;
+ }
+ ReleaseSysCache(rftuple);
+ break;
+ }
+
+ }
+
+ } /* loop all subscribed publications */
+
+ /*
+ * Combine all the row filters (if any) into a single filter, and then
+ * build the ExprState for it.
+ */
+ n_filters = list_length(rfnodes);
+ if (n_filters > 0)
+ {
+ Node *rfnode;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes, -1) : linitial(rfnodes);
+ entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
+
+ /*
+ * Create a tuple table slot for the row filter. The TupleDesc must live
+ * as long as the cache entry does.
+ */
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ entry->rowfilter_valid = true;
+ }
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ if (message_level_is_interesting(DEBUG3))
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * NOTE: Multiple publication row-filters have already been combined into
+ * a single exprstate, so just evaluate it. (The no-filter case already
+ * returned above.)
+ */
+ result = pgoutput_row_filter_exec_expr(entry->exprstate, ecxt);
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +932,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +956,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +963,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * Schema should be sent before the logic that replaces the
+ * relation because it also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +996,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +1030,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +1099,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1116,9 +1421,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1139,8 +1445,11 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1245,9 +1554,6 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
}
list_free(pubids);
@@ -1354,6 +1660,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanups. (Will be rebuilt later if needed).
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstate != NULL)
+ {
+ pfree(entry->exprstate);
+ entry->exprstate = NULL;
+ }
}
}
@@ -1365,6 +1686,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1696,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1392,6 +1716,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index ea721d9..8be5643 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3150,17 +3150,22 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
" JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
"WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
"UNION\n"
"SELECT pubname\n"
+ " , pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+ " , pg_catalog.pg_class c\n"
"WHERE pr.prrelid = '%s'\n"
+ " AND c.oid = pr.prrelid\n"
"UNION\n"
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
"WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
"ORDER BY 1;",
@@ -3196,6 +3201,13 @@ describeOneTableDetails(const char *schemaname,
printfPQExpBuffer(&buf, " \"%s\"",
PQgetvalue(result, i, 0));
+ if (pset.sversion >= 150000)
+ {
+ /* Also display the publication row-filter (if any) for this table */
+ if (!PQgetisnull(result, i, 1))
+ appendPQExpBuffer(&buf, " WHERE (%s)", PQgetvalue(result, i, 1));
+ }
+
printTableAddFooter(&cont, buf.data);
}
PQclear(result);
@@ -6320,8 +6332,12 @@ addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
for (i = 0; i < count; i++)
{
if (!singlecol)
+ {
printfPQExpBuffer(buf, " \"%s.%s\"", PQgetvalue(res, i, 0),
PQgetvalue(res, i, 1));
+ if (!PQgetisnull(res, i, 2))
+ appendPQExpBuffer(buf, " WHERE (%s)", PQgetvalue(res, i, 2));
+ }
else
printfPQExpBuffer(buf, " \"%s\"", PQgetvalue(res, i, 0));
@@ -6450,8 +6466,15 @@ describePublications(const char *pattern)
{
/* Get the tables for the specified publication */
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 1ae439e..4a25222 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -20,6 +20,7 @@
#include "catalog/genbki.h"
#include "catalog/objectaddress.h"
#include "catalog/pg_publication_d.h"
+#include "parser/parse_node.h"
/* ----------------
* pg_publication definition. cpp turns this into
@@ -86,6 +87,7 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -122,13 +124,16 @@ extern List *GetPubPartitionOptionRelations(List *result,
Oid relid);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
extern char *get_publication_name(Oid pubid, bool missing_ok);
+extern Node *GetTransformedWhereClause(ParseState *pstate,
+ PublicationRelInfo *pri,
+ bool bfixupcollation);
#endif /* PG_PUBLICATION_H */
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 067138e..5d58a9f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3641,6 +3641,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 1feb558..6959675 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -239,6 +239,154 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+ Table "public.testpub_rf_tbl1"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+----------+--------------+-------------
+ a | integer | | | | plain | |
+ b | text | | | | extended | |
+Publications:
+ "testpub5a" WHERE ((a > 1))
+ "testpub5b"
+ "testpub5c" WHERE ((a > 3))
+
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+ Publication testpub_syntax1
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE ((e < 999))
+
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+ Publication testpub_syntax2
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "testpub_rf_myschema.testpub_rf_tbl5" WHERE ((h < 999))
+
+DROP PUBLICATION testpub_syntax2;
+-- fail - schemas are not allowed WHERE row-filter
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a =...
+ ^
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+ERROR: invalid to use WHERE (row-filter) for a schema
+LINE 1: ...ax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf...
+ ^
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+ERROR: cannot add relation "testpub_rf_myschema1.testpub_rf_tb16" to publication
+DETAIL: Table's schema "testpub_rf_myschema1" is already part of the publication or part of the specified schema list.
+RESET client_min_messages;
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 8fa0435..40198fc 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -134,6 +134,81 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;
+-- fail - a WHERE row-filter is not allowed with schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish="insert");
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish="insert");
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+RESET client_min_messages;
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
new file mode 100644
index 0000000..64e71d0
--- /dev/null
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -0,0 +1,357 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 10;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4a FOR TABLE tab_rowfilter_4 WHERE (c % 2 = 0)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4b FOR TABLE tab_rowfilter_4"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) SELECT generate_series(1, 10)");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4a, tap_pub_4b"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if the
+# c value is divisible by 2 OR 3 (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(13|2|20), 'check initial data copy from table tab_rowfilter_2');
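The expected count can be cross-checked with a quick sketch of the OR'ed-filter arithmetic (Python here purely for illustration; it is not part of the TAP test):

```python
# Rows 1..20 survive if they satisfy either publication's filter:
# tap_pub_1 uses (c % 2 = 0), tap_pub_2 uses (c % 3 = 0).
replicated = [c for c in range(1, 21) if c % 2 == 0 or c % 3 == 0]
print(len(replicated), min(replicated), max(replicated))  # 13 2 20
```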
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but the absence of a filter just means
+# OR everything - i.e. the same as no filter at all.
+# Expect all rows: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(10|1|10), 'check initial data copy from table tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
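The two partition checks above follow from routing each inserted row by range and then applying that partition's own filter. A small illustrative sketch (not part of the test suite):

```python
# Route each row to its partition by range boundary (10000), then apply
# that partition's filter (publish_via_partition_root is false):
# tab_rowfilter_less_10k keeps a < 6000, tab_rowfilter_greater_10k has none.
rows = [(1, 100), (7000, 101), (15000, 102), (5500, 300),
        (2, 200), (6005, 201), (16000, 103)]
less_10k = sorted(r for r in rows if r[0] < 10000 and r[0] < 6000)
greater_10k = sorted(r for r in rows if r[0] >= 10000)
print(less_10k)     # [(1, 100), (2, 200), (5500, 300)]
print(greater_10k)  # [(15000, 102), (16000, 103)]
```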
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) VALUES (21), (22), (23), (24), (25)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) VALUES (0), (11), (12)");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if the
+# c value is divisible by 2 OR 3.
+#
+# Expect original rows (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+# Plus (21, 22, 24)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(16|2|24), 'check replicated rows to tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but the absence of a filter just means
+# OR everything - i.e. the same as no filter at all.
+# Expect all rows from initial copy: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+# And also (0, 11, 12)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
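The tab_rowfilter_2 expectation above (16|2|24) can again be cross-checked with a small sketch of the OR'ed filters (illustrative only, not part of the test):

```python
filt = lambda c: c % 2 == 0 or c % 3 == 0   # OR of tap_pub_1 and tap_pub_2
initial = {c for c in range(1, 21) if filt(c)}      # rows from initial copy
later = {c for c in (21, 22, 23, 24, 25) if filt(c)}  # post-subscription inserts
rows = sorted(initial | later)
print(len(rows), rows[0], rows[-1])  # 16 2 24
```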
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, because the row filter contains column b, which is
+# not part of the PK or REPLICA IDENTITY, so the old tuple has b = NULL and
+# the row filter evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
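A rough model of why the UPDATE to NULL is filtered out, approximating SQL's three-valued logic by treating a NULL comparison as non-matching (hypothetical helper, for illustration only):

```python
def row_matches(a, b):
    # b <> 'filtered' evaluates to NULL when b is NULL, and a row filter
    # that evaluates to NULL does not match (treated the same as false).
    if b is None:
        return False
    return a > 1000 and b != 'filtered'

print(row_matches(800, 'test 800'))            # False: a is not > 1000
print(row_matches(1600, None))                 # False: NULL comparison
print(row_matches(1601, 'test 1601 updated'))  # True
```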
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
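The final contents of tab_rowfilter_partitioned can be derived with a short sketch of the root-filter behavior (illustrative Python, not part of the test):

```python
# Survivors of the re-copy through the root filter (a < 5000), then the
# replicated DML, modelled as a dict keyed by the primary key a.
table = {a: b for a, b in [(1, 100), (7000, 101), (15000, 102), (5500, 300),
                           (2, 200), (6005, 201), (16000, 103)] if a < 5000}
for a, b in [(4000, 400), (4001, 401), (4002, 402), (4500, 450),
             (5600, 123), (14000, 1950)]:
    if a < 5000:                    # root-table row filter
        table[a] = b
table[4001] = 30                    # UPDATE passes the filter (4001 < 5000)
del table[4002]                     # DELETE passes the filter (4002 < 5000)
print(sorted(table.items()))
```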
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
v44-0006-Cache-ExprState-per-pubaction.patchapplication/octet-stream; name=v44-0006-Cache-ExprState-per-pubaction.patchDownload
From 208598e51c7492a432dde25b0c600c8d4719a445 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 2 Dec 2021 17:58:34 +1100
Subject: [PATCH v44] Cache ExprState per pubaction.
If a subscriber has multiple publications and these publications include the
same table then there can be multiple filters that apply to that table.
These filters are stored per pubaction of the publications. There are 4 kinds
of pubaction (insert, update, delete, truncate).
Filters for the same pubaction are all combined (OR'ed) and cached as one, so
at the end there are at most 4 cached filters per table.
The appropriate (pubaction) filter is executed according to the DML operation.
Author: Peter Smith
Discussion: https://www.postgresql.org/message-id/CAA4eK1%2BhVXfOSScbf5LUB%3D5is%3DwYaC6NBhLxuvetbWQnZRnsVQ%40mail.gmail.com
---
src/backend/replication/pgoutput/pgoutput.c | 183 +++++++++++++++++++---------
1 file changed, 124 insertions(+), 59 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 0ccffa7..5a169d9 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -130,10 +130,15 @@ typedef struct RelationSyncEntry
* The flag 'rowfilter_valid' indicates if the exprstate has been assigned
* yet or not. We cannot just use the exprstate value for this purpose
* because there might be no filter at all for the current relid (e.g.
- * exprstate is NULL).
+ * every exprstate is NULL).
+ * The row-filter exprstate is stored per pubaction type.
*/
bool rowfilter_valid;
- ExprState *exprstate; /* ExprState for row filter(s) */
+#define IDX_PUBACTION_INSERT 0
+#define IDX_PUBACTION_UPDATE 1
+#define IDX_PUBACTION_DELETE 2
+#define IDX_PUBACTION_TRUNCATE 3
+ ExprState *exprstate[4]; /* ExprState for row filter(s). One per pubaction. */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
@@ -175,10 +180,10 @@ static EState *create_estate_for_relation(Relation rel);
static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(int idx_pubaction, Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, TupleTableSlot *slot,
RelationSyncEntry *entry);
-static bool pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter_update_check(int idx_pubaction, Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry,
ReorderBufferChangeType *action);
@@ -755,7 +760,7 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
* If the change is to be replicated returns true, else false.
*/
static bool
-pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
+pgoutput_row_filter_update_check(int idx_pubaction, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
{
TupleDesc desc = RelationGetDescr(relation);
int i;
@@ -763,7 +768,7 @@ pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTupl
TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
/* Bail out if there is no row filter */
- if (!entry->exprstate)
+ if (!entry->exprstate[idx_pubaction])
return true;
/* update requires a new tuple */
@@ -780,7 +785,7 @@ pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTupl
if (!oldtuple)
{
*action = REORDER_BUFFER_CHANGE_UPDATE;
- return pgoutput_row_filter(relation, NULL, newtuple, NULL, entry);
+ return pgoutput_row_filter(idx_pubaction, relation, NULL, newtuple, NULL, entry);
}
old_slot = entry->old_tuple;
@@ -827,8 +832,8 @@ pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTupl
}
- old_matched = pgoutput_row_filter(relation, NULL, NULL, old_slot, entry);
- new_matched = pgoutput_row_filter(relation, NULL, NULL, tmp_new_slot, entry);
+ old_matched = pgoutput_row_filter(idx_pubaction, relation, NULL, NULL, old_slot, entry);
+ new_matched = pgoutput_row_filter(idx_pubaction, relation, NULL, NULL, tmp_new_slot, entry);
if (!old_matched && !new_matched)
return false;
@@ -850,8 +855,8 @@ static void
pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
{
ListCell *lc;
- List *rfnodes = NIL;
- int n_filters;
+ List *rfnodes[4] = {NIL, NIL, NIL, NIL}; /* One per pubaction */
+ bool no_filter[4] = {false, false, false, false}; /* One per pubaction */
/*
* If the row filter caching is currently flagged "invalid" then it means we
@@ -907,7 +912,7 @@ pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntr
bool rfisnull;
/*
- * Lookup if there is a row-filter, and if yes remember it in a list.
+ * Look up whether there is a row-filter, and if so remember it in a list (per pubaction).
* In code following this 'publications' loop we will combine all filters.
*/
rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(entry->publish_as_relid), ObjectIdGetDatum(pub->oid));
@@ -920,56 +925,108 @@ pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntr
Node *rfnode;
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = stringToNode(TextDatumGetCString(rfdatum));
- rfnodes = lappend(rfnodes, rfnode);
+ /* Gather the rfnodes per pubaction of this publication. */
+ if (pub->pubactions.pubinsert)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[IDX_PUBACTION_INSERT] = lappend(rfnodes[IDX_PUBACTION_INSERT], rfnode);
+ }
+ if (pub->pubactions.pubupdate)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[IDX_PUBACTION_UPDATE] = lappend(rfnodes[IDX_PUBACTION_UPDATE], rfnode);
+ }
+ if (pub->pubactions.pubdelete)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[IDX_PUBACTION_DELETE] = lappend(rfnodes[IDX_PUBACTION_DELETE], rfnode);
+ }
+ if (pub->pubactions.pubtruncate)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[IDX_PUBACTION_TRUNCATE] = lappend(rfnodes[IDX_PUBACTION_TRUNCATE], rfnode);
+ }
MemoryContextSwitchTo(oldctx);
-
- ReleaseSysCache(rftuple);
}
else
{
- /*
- * If there is no row-filter, then any other row-filters for this table
- * also have no effect (because filters get OR-ed together) so we can
- * just discard anything found so far and exit early from the publications
- * loop.
- */
- if (rfnodes)
- {
- list_free_deep(rfnodes);
- rfnodes = NIL;
- }
- ReleaseSysCache(rftuple);
- break;
+ /* Remember which pubactions have no row-filter. */
+ if (pub->pubactions.pubinsert)
+ no_filter[IDX_PUBACTION_INSERT] = true;
+ if (pub->pubactions.pubupdate)
+ no_filter[IDX_PUBACTION_UPDATE] = true;
+ if (pub->pubactions.pubdelete)
+ no_filter[IDX_PUBACTION_DELETE] = true;
+ if (pub->pubactions.pubtruncate)
+ no_filter[IDX_PUBACTION_TRUNCATE] = true;
}
+ ReleaseSysCache(rftuple);
}
} /* loop all subscribed publications */
/*
- * Combine using all the row-filters (if any) into a single filter, and then build the ExprState for it
+ * Now that all the filters for all pubactions are known, combine the
+ * filters that share the same pubaction.
*/
- n_filters = list_length(rfnodes);
- if (n_filters > 0)
{
- Node *rfnode;
- TupleDesc tupdesc = RelationGetDescr(relation);
+ int idx;
+ bool found_filters = false;
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes, -1) : linitial(rfnodes);
- entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
+ /* For each pubaction... */
+ for (idx = 0; idx < 4; idx++)
+ {
+ int n_filters;
- /*
- * Create tuple table slots for row filter. Create a copy of the
- * TupleDesc as it needs to live as long as the cache remains.
- */
- tupdesc = CreateTupleDescCopy(tupdesc);
- entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
- entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
- entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
- entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
- MemoryContextSwitchTo(oldctx);
+ /*
+ * If one or more publications with this pubaction had no filter at all,
+ * then that nullifies the effect of all other filters for the same
+ * pubaction (because filters get OR'ed together).
+ */
+ if (no_filter[idx])
+ {
+ if (rfnodes[idx])
+ {
+ list_free_deep(rfnodes[idx]);
+ rfnodes[idx] = NIL;
+ }
+ }
+
+ /*
+ * If there were one or more filters for this pubaction then combine them
+ * (if necessary) and cache the ExprState.
+ */
+ n_filters = list_length(rfnodes[idx]);
+ if (n_filters > 0)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes[idx], -1) : linitial(rfnodes[idx]);
+ entry->exprstate[idx] = pgoutput_row_filter_init_expr(rfnode);
+ MemoryContextSwitchTo(oldctx);
+
+ found_filters = true; /* flag that we will need slots made */
+ }
+ } /* for each pubaction */
+
+ if (found_filters)
+ {
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create tuple table slots for row filter. Create a copy of the
+ * TupleDesc as it needs to live as long as the cache remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ MemoryContextSwitchTo(oldctx);
+ }
}
entry->rowfilter_valid = true;
@@ -982,7 +1039,7 @@ pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntr
* If it returns true, the change is replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *slot,
+pgoutput_row_filter(int idx_pubaction, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *slot,
RelationSyncEntry *entry)
{
EState *estate;
@@ -991,7 +1048,7 @@ RelationSyncEntry *entry)
Oid relid = RelationGetRelid(relation);
/* Bail out if there is no row filter */
- if (!entry->exprstate)
+ if (!entry->exprstate[idx_pubaction])
return true;
if (message_level_is_interesting(DEBUG3))
@@ -1016,12 +1073,12 @@ RelationSyncEntry *entry)
/*
* NOTE: Multiple publication row-filters have already been combined to a
- * single exprstate.
+ * single exprstate (for this pubaction).
*/
- if (entry->exprstate)
+ if (entry->exprstate[idx_pubaction])
{
/* Evaluates row filter */
- result = pgoutput_row_filter_exec_expr(entry->exprstate, ecxt);
+ result = pgoutput_row_filter_exec_expr(entry->exprstate[idx_pubaction], ecxt);
}
/* Cleanup allocated resources */
@@ -1093,7 +1150,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, NULL, tuple, NULL, relentry))
+ if (!pgoutput_row_filter(IDX_PUBACTION_INSERT, relation, NULL, tuple, NULL, relentry))
break;
/*
@@ -1126,7 +1183,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
- if (!pgoutput_row_filter_update_check(relation, oldtuple, newtuple, relentry,
+ if (!pgoutput_row_filter_update_check(IDX_PUBACTION_UPDATE, relation, oldtuple, newtuple, relentry,
&modified_action))
break;
@@ -1180,7 +1237,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, NULL, NULL, relentry))
+ if (!pgoutput_row_filter(IDX_PUBACTION_DELETE, relation, oldtuple, NULL, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1601,7 +1658,10 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->new_tuple = NULL;
entry->old_tuple = NULL;
entry->tmp_new_tuple = NULL;
- entry->exprstate = NULL;
+ entry->exprstate[IDX_PUBACTION_INSERT] = NULL;
+ entry->exprstate[IDX_PUBACTION_UPDATE] = NULL;
+ entry->exprstate[IDX_PUBACTION_DELETE] = NULL;
+ entry->exprstate[IDX_PUBACTION_TRUNCATE] = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1768,6 +1828,7 @@ static void
rel_sync_cache_relation_cb(Datum arg, Oid relid)
{
RelationSyncEntry *entry;
+ int idx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1822,10 +1883,14 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
- if (entry->exprstate != NULL)
+ /* Cleanup the ExprState for each of the pubactions. */
+ for (idx = 0; idx < 4; idx++)
{
- pfree(entry->exprstate);
- entry->exprstate = NULL;
+ if (entry->exprstate[idx] != NULL)
+ {
+ pfree(entry->exprstate[idx]);
+ entry->exprstate[idx] = NULL;
+ }
}
}
}
--
1.8.3.1
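The combining rule the patch implements in pgoutput_row_filter_init can be sketched outside of C (hypothetical helper, not PostgreSQL code): each publication contributes its filter to every pubaction it publishes, and a publication with no filter clears that pubaction's slot entirely.

```python
PUBACTIONS = ("insert", "update", "delete", "truncate")

def combine(publications):
    """publications: list of (published_actions, row_filter_or_None)."""
    rfnodes = {a: [] for a in PUBACTIONS}
    no_filter = {a: False for a in PUBACTIONS}
    for actions, rfilter in publications:
        for a in actions:
            if rfilter is None:
                no_filter[a] = True          # equivalent to OR'ing with "true"
            else:
                rfnodes[a].append(rfilter)
    # A publication with no filter nullifies the other filters for the
    # pubactions it publishes, because filters get OR'ed together.
    return {a: ([] if no_filter[a] else rfnodes[a]) for a in PUBACTIONS}

# tap_pub_4a publishes inserts with a filter; tap_pub_4b publishes
# everything unfiltered, so even the insert slot ends up with no filter.
cached = combine([(("insert",), "(c % 2 = 0)"), (PUBACTIONS, None)])
print(cached["insert"])  # []
```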
On Thu, Dec 2, 2021 at 9:29 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Thu, Dec 2, 2021 5:21 AM Peter Smith <smithpb2250@gmail.com> wrote:
PSA the v44* set of patches.
The following review comments are addressed:
v44-0001 main patch
- Renamed the TAP test 026->027 due to clash caused by recent commit [1]
- Refactored table_close [Houz 23/11] #2
- Alter compare where clauses [Amit 24/11] #0
- PG docs CREATE SUBSCRIPTION [Tang 30/11] #2
- PG docs CREATE PUBLICATION [Vignesh 30/11] #1, #4, [Tang 30/11] #1, [Tomas 23/9] #2
v44-0002 validation walker
- Add NullTest support [Peter 18/11]
- Update comments [Amit 24/11] #3
- Disallow user-defined types [Amit 24/11] #4
- Errmsg - skipped because handled by top-up [Vignesh 23/11] #2
- Removed #if 0 [Vignesh 30/11] #2
v44-0003 new/old tuple
- NA
v44-0004 tab-complete and pgdump
- Handle table-list commas better [Vignesh 23/11] #2
v44-0005 top-up patch for validation
- (This patch will be added again later)
Attach the v44-0005 top-up patch.
Thanks for the updated patch, few comments:
1) The testpub5a and testpub5c publications are the same; one of them can be removed
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1)
WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3)
WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
testpub5b will be covered in the earlier existing case above:
ALTER PUBLICATION testpib_ins_trunct ADD TABLE pub_test.testpub_nopk,
testpub_tbl1;
\d+ pub_test.testpub_nopk
\d+ testpub_tbl1
I felt the test related to testpub5b is also not required
2) testpub5 and testpub_syntax2 are similar; one of them can be removed:
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1,
testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1,
testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;
3) testpub7 can be renamed to testpub6 to maintain continuity,
since the previous testpub6 did not succeed:
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer,
RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA
testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+RESET client_min_messages;
4) Did this test intend to include a WHERE clause for testpub_rf_tb16?
If so, it can be added:
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA
testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+RESET client_min_messages;
5) It should be removed from typedefs.list too:
-/* For rowfilter_walker. */
-typedef struct {
- Relation rel;
- bool check_replident; /* check if Var is
bms_replident member? */
- Bitmapset *bms_replident;
-} rf_context;
-
Regards,
Vignesh
On Thu, Dec 2, 2021, at 4:18 AM, Peter Smith wrote:
PSA a new v44* patch set.
It includes a new patch 0006 which implements the idea above.
ExprState cache logic is basically all the same as before (including
all the OR combining), but there are now 4x ExprState caches keyed and
separated by the 4x different pubactions.
Row filter is not applied for TRUNCATEs, so it is just 3 operations.
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Fri, Dec 3, 2021 at 12:59 AM Euler Taveira <euler@eulerto.com> wrote:
On Thu, Dec 2, 2021, at 4:18 AM, Peter Smith wrote:
PSA a new v44* patch set.
It includes a new patch 0006 which implements the idea above.
ExprState cache logic is basically all the same as before (including
all the OR combining), but there are now 4x ExprState caches keyed and
separated by the 4x different pubactions.
Row filter is not applied for TRUNCATEs so it is just 3 operations.
Correct. The patch 0006 comment/code will be updated for this point in
the next version posted.
------
Kind Regards,
Peter Smith.
Fujitsu Australia.
On Thu, Dec 2, 2021 at 2:32 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Thursday, December 2, 2021 5:21 AM Peter Smith <smithpb2250@gmail.com> wrote:
PSA the v44* set of patches.
Thanks for the new patch. Few comments:
1. This is an example in publication doc, but in fact it's not allowed. Should we
change this example?
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
postgres=# CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
ERROR: invalid publication WHERE expression for relation "departments"
HINT: only simple expressions using columns, constants and immutable system functions are allowed
Thanks for finding this. Actually, the documentation looks correct to
me. The problem was the validation walker of patch 0002 was being
overly restrictive. It needed to also allow a BooleanTest node.
Now it works (locally) for me. For example:
test_pub=# create table departments(depno int primary key, active boolean);
CREATE TABLE
test_pub=# create publication pdept for table departments where
(active is true) with (publish="insert");
CREATE PUBLICATION
test_pub=# create publication pdept2 for table departments where
(active is false) with (publish="insert");
CREATE PUBLICATION
This fix will be available in v45*.
------
Kind Regards,
Peter Smith.
Fujitsu Australia.
On Thu, Dec 2, 2021 at 6:18 PM Peter Smith <smithpb2250@gmail.com> wrote:
PSA a new v44* patch set.
Some initial comments:
0001
src/backend/replication/logical/tablesync.c
(1) In fetch_remote_table_info update, "list_free(*qual);" should be
"list_free_deep(*qual);"
doc/src/sgml/ref/create_subscription.sgml
(2) Refer to Notes
Perhaps a link to the Notes section should be used here, as follows:
- copied. Refer to the Notes section below.
+ copied. Refer to the <xref
linkend="sql-createsubscription-notes"/> section below.
- <refsect1>
+ <refsect1 id="sql-createsubscription-notes" xreflabel="Notes">
0002
1) Typo in patch comment
"Specifially"
src/backend/catalog/pg_publication.c
2) bms_replident comment
Member "Bitmapset *bms_replident;" in rf_context should have a
comment, maybe something like "set of replica identity col indexes".
3) errdetail message
In rowfilter_walker(), the "forbidden" errdetail message is loaded
using gettext() in one instance, but just a raw formatted string in
other cases. Shouldn't they all consistently be translated strings?
0003
src/backend/replication/logical/proto.c
1) logicalrep_write_tuple
(i)
if (slot == NULL || TTS_EMPTY(slot))
can be replaced with:
if (TupIsNull(slot))
(ii) In the above case (where values and nulls are palloc'd),
shouldn't the values and nulls be pfree()d at the end of the function?
0005
src/backend/utils/cache/relcache.c
(1) RelationGetInvalRowFilterCol
Shouldn't "rfnode" be pfree()d after use?
Regards,
Greg Nancarrow
Fujitsu Australia
On Thu, Dec 2, 2021, at 4:18 AM, Peter Smith wrote:
PSA a new v44* patch set.
We have been actively developing this feature for some months and have improved
it a lot. This has been good teamwork. It seems a good time to provide a
retrospective for this feature based on the consensus we have reached so far.
The current design has one row filter per publication-table mapping. It allows
flexible choices while using the same table for multiple replication purposes.
The WHERE clause was chosen as the syntax to declare the row filter expression
(enclosed by parentheses).
There was a lot of discussion about which columns are allowed in the row
filter expression. The consensus was that publications that publish UPDATE
and/or DELETE operations should check that the columns in the row filter
expression are part of the replica identity. Otherwise, these DML operations
couldn't be replicated.
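To make this concrete, here is a hedged SQL sketch of the rule above (the
table, column, and publication names are made up for illustration):

CREATE TABLE items (id int PRIMARY KEY, qty int);
-- ok: id is part of the replica identity (the primary key by default)
CREATE PUBLICATION pub_ok FOR TABLE items WHERE (id > 100);
-- expected to be rejected by the validation: qty is not part of the
-- replica identity, and the publication publishes UPDATE/DELETE by default
CREATE PUBLICATION pub_bad FOR TABLE items WHERE (qty > 0);
-- acceptable for an insert-only publication, since new rows carry all columns
CREATE PUBLICATION pub_ins FOR TABLE items WHERE (qty > 0) WITH (publish = 'insert');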
We also discussed which expressions would be allowed. We couldn't allow all
kinds of expressions because, given the way the logical decoding
infrastructure was designed, some expressions could break the replication.
Hence, we decided to allow only "simple expressions". By "simple expression",
we mean an expression that (a) contains no user-defined objects (functions,
operators, types) and (b) uses only immutable builtin functions.
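As an illustration (assuming a hypothetical table t(a int, b text); the
function and publication names are made up), the "simple expression" rule
permits and forbids roughly the following:

-- allowed: columns, constants, immutable built-in operators and functions
CREATE PUBLICATION pub_simple FOR TABLE t WHERE (a > 0 AND b <> 'skip');
-- expected to be rejected: user-defined function
CREATE PUBLICATION pub_udf FOR TABLE t WHERE (my_func(a));
-- expected to be rejected: non-immutable built-in function
CREATE PUBLICATION pub_vol FOR TABLE t WHERE (a > random() * 10);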
A subscription can subscribe to multiple publications. These publications can
publish the same table. In this case, we have to combine the row filter
expressions to decide whether the row will be replicated. The consensus was to
replicate a row if any of the row filters returns true. This means that if one
publication-table mapping does not have a row filter, the row will be
replicated. There is an optimization for this case that provides an empty
expression for such a table. Hence, it bails out and replicates the row
without running the row filter code.
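For example (a sketch; the table and publication names are made up),
subscribing to two publications of the same table combines their filters with
OR:

CREATE PUBLICATION pub_big FOR TABLE orders WHERE (amount > 100);
CREATE PUBLICATION pub_urgent FOR TABLE orders WHERE (priority = 1);
-- the subscriber receives rows satisfying (amount > 100) OR (priority = 1)
CREATE SUBSCRIPTION sub CONNECTION '...' PUBLICATION pub_big, pub_urgent;
-- if either publication had no WHERE clause, all rows would be replicated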
The same logic applies to the initial table synchronization if there are
multiple row filters: copy all rows that satisfy at least one row filter
expression. If the subscriber is a pre-15 version, data synchronization won't
use row filters even if they are defined in the publisher.
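Conceptually (this is a sketch of the idea, not the exact query the tablesync
worker builds), the initial copy fetches only the qualifying rows by OR'ing
the filters into the remote query:

COPY (SELECT * FROM orders WHERE (amount > 100) OR (priority = 1)) TO STDOUT;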
If we are dealing with partitioned tables, the publication parameter
publish_via_partition_root determines whether it uses the partition row filter
(false) or the root partitioned table row filter (true).
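For instance (a made-up partitioned table), the parameter chooses which filter
is consulted:

CREATE TABLE meas (logdate date, val int) PARTITION BY RANGE (logdate);
CREATE TABLE meas_2021 PARTITION OF meas
    FOR VALUES FROM ('2021-01-01') TO ('2022-01-01');
-- the partition's own filter is used when publish_via_partition_root is false (default)
CREATE PUBLICATION pub_leaf FOR TABLE meas_2021 WHERE (val > 0);
-- the root's filter is used when publish_via_partition_root is true
CREATE PUBLICATION pub_root FOR TABLE meas WHERE (val > 10)
    WITH (publish_via_partition_root = true);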
I used the last patch series (v44) posted by Peter Smith [1]. I made a lot of
improvements in this new version (v45). I merged 0001 (it is basically the main
patch I wrote) and 0004 (autocomplete). As I explained in [2], I implemented a
patch (incorporated into v45-0001) to fix this issue. I saw that
Peter already proposed a slightly different patch (0006). I read this patch and
concluded that it would be better to keep the version I have. It fixes a few
things and also includes more comments. I attached another patch (v45-0002)
that includes the expression validation. It is based on 0002, which I completely
overhauled. There are additional expressions that were not supported by the
previous version (such as conditional expressions [CASE, COALESCE, NULLIF,
GREATEST, LEAST], array operators, XML operators). I probably haven't finished
the supported node list (there are a few primitive nodes that still need to be
checked). However, the current "simple expression" routine seems promising. I
plan to integrate v45-0002 into the next patch version. I attached it here for
comparison purposes only.
My next step is to review 0003. As I said before, I would like to treat it as a
separate feature. I know that it is useful for data consistency, but this patch
is already too complex. Having said that, I didn't include it in this patch
series because it doesn't apply cleanly. If Ajin would like to provide a new
version, I would appreciate it.
PS> I will update the commit message in the next version. I barely changed the
documentation to reflect the current behavior. I probably missed some changes
but I will fix them in the next version.
[1]: /messages/by-id/CAHut+PtJnnM8MYQDf7xCyFAp13U_0Ya2dv-UQeFD=ghixFLZiw@mail.gmail.com
[2]: /messages/by-id/ca8d270d-f930-4d15-9f24-60f95b364173@www.fastmail.com
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
v45-0001-Row-filter-for-logical-replication.patchtext/x-patch; name=v45-0001-Row-filter-for-logical-replication.patchDownload
From 23a7b4685eb9039c4f84f445dc124474c0805d39 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 22 Nov 2021 15:02:19 -0300
Subject: [PATCH v45 1/2] Row filter for logical replication
This feature adds row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of
the primary key or that are covered by REPLICA IDENTITY. Otherwise, any
DELETEs won't be replicated. DELETE uses the old row version (that is
limited to primary key or REPLICA IDENTITY) to evaluate the row filter.
INSERT and UPDATE use the new row version to evaluate the row filter,
hence, you can use any column. If the row filter evaluates to NULL, it
returns false. For simplicity, functions are not allowed; it could
possibly be addressed in a future patch.
If you choose to do the initial table synchronization, only data that
satisfies the row filters is sent. If the subscription has several
publications in which a table has been published with different WHERE
clauses, rows must satisfy all expressions to be copied. If subscriber
is a pre-15 version, data synchronization won't use row filters if they
are defined in the publisher. Previous versions cannot handle row
filters.
If your publication contains a partitioned table, the publication
parameter publish_via_partition_root determines if it uses the partition
row filter (if the parameter is false, the default) or the root
partitioned table row filter.
Discussion: https://postgr.es/m/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 13 +-
doc/src/sgml/ref/create_publication.sgml | 33 +-
doc/src/sgml/ref/create_subscription.sgml | 17 +
src/backend/catalog/pg_publication.c | 48 ++-
src/backend/commands/publicationcmds.c | 76 +++-
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 38 +-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/replication/logical/tablesync.c | 118 +++++-
src/backend/replication/pgoutput/pgoutput.c | 378 +++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 30 +-
src/bin/psql/tab-complete.c | 27 +-
src/include/catalog/pg_publication.h | 3 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 148 ++++++++
src/test/regress/sql/publication.sql | 75 ++++
src/test/subscription/t/027_row_filter.pl | 362 +++++++++++++++++++
26 files changed, 1397 insertions(+), 53 deletions(-)
mode change 100644 => 100755 src/backend/parser/gram.y
create mode 100644 src/test/subscription/t/027_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1d11be73f..af6b1f684c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6299,6 +6299,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index bb4ef5e5e2..5d9869c4f6 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -30,7 +30,7 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -52,7 +52,9 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
remove one or more tables/schemas from the publication. Note that adding
tables/schemas to a publication that is already subscribed to will require a
<literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on the
- subscribing side in order to become effective.
+ subscribing side in order to become effective. Note also that the combination
+ of <literal>DROP</literal> with <literal>WHERE</literal> clause is not
+ allowed.
</para>
<para>
@@ -109,7 +111,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index d805e8e77a..b71dfced3b 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -28,7 +28,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -76,6 +76,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -225,6 +229,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
disallowed on those tables.
</para>
+ <para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or be covered by <literal>REPLICA
+ IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
+ be replicated. That's because old row is used and it only contains primary
+ key or columns that are part of the <literal>REPLICA IDENTITY</literal>; the
+ remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
+ and <command>UPDATE</command> operations, any column might be used in the
+ <literal>WHERE</literal> clause. New row is used and it contains all
+ columns. A <literal>NULL</literal> value causes the expression to evaluate
+ to false; avoid using columns without not-null constraints in the
+ <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
+ not allow functions or user-defined operators.
+ </para>
+
<para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
@@ -247,6 +266,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -259,6 +283,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</programlisting>
</para>
+ <para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
<para>
Create a publication that publishes all changes in all tables:
<programlisting>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f1a1..6aaad517a2 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -208,6 +208,10 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
that are being subscribed to when the replication starts.
The default is <literal>true</literal>.
</para>
+ <para>
+ Row-filtering may also apply here and will affect what data is
+ copied. Refer to the Notes section below.
+ </para>
</listitem>
</varlistentry>
@@ -319,6 +323,19 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
the parameter <literal>create_slot = false</literal>. This is an
implementation restriction that might be lifted in a future release.
</para>
+
+ <para>
+ If any table in the publication has a <literal>WHERE</literal> clause, rows
+ that do not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be replicated. If the subscription has several publications in
+ which the same table has been published with different filters, those
+ expressions get OR'ed together so that rows satisfying any of the expressions
+ will be replicated. Notice this means if one of the publications has no filter
+ at all then all other filters become redundant. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row filtering
+ is ignored during the initial table synchronization phase.
+ </para>
+
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 63579b2f82..651e719f81 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -257,18 +260,22 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -289,10 +296,30 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a ParseState to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -306,6 +333,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -322,6 +355,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 7d4a0e95f6..bade24f12a 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -529,40 +529,64 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *delrels = NIL;
ListCell *oldlc;
+ /*
+ * Check if the relation is member of the existing schema in the
+ * publication or member of the schema list specified.
+ */
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
- /* Calculate which relations to drop. */
+ /*
+ * Remove tables that are not found in the new table list. Remove
+ * tables that are being re-added with a different qual expression
+ * (including a table that has no qual expression) because simply
+ * updating the existing tuple is not enough due to qual expression
+ * dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
+ PublicationRelInfo *oldrel;
ListCell *newlc;
bool found = false;
+ HeapTuple rftuple;
+ bool rfisnull = true; /* default if tuple is not found */
+
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(oldrelid),
+ ObjectIdGetDatum(pubid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual,
+ &rfisnull);
+ ReleaseSysCache(rftuple);
+ }
foreach(newlc, rels)
{
PublicationRelInfo *newpubrel;
newpubrel = (PublicationRelInfo *) lfirst(newlc);
- if (RelationGetRelid(newpubrel->relation) == oldrelid)
+
+ /*
+ * Keep the table iif old and new table has no qual
+ * expression. Otherwise, this table will be included in the
+ * deletion list below.
+ */
+ if (oldrelid == RelationGetRelid(newpubrel->relation) &&
+ newpubrel->whereClause == NULL && rfisnull)
{
found = true;
break;
}
}
- /* Not yet in the list, open it and add to the list */
+
if (!found)
{
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrelid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
}
@@ -899,6 +923,7 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ List *relids_with_rf = NIL;
/*
* Open, share-lock, and check all the explicitly-specified relations
@@ -926,15 +951,30 @@ OpenTableList(List *tables)
*/
if (list_member_oid(relids, myrelid))
{
+ char *relname = pstrdup(RelationGetRelationName(rel));
+
table_close(rel, ShareUpdateExclusiveLock);
+
+ /* Disallow duplicate tables if there are any WHERE clauses. */
+ if (t->whereClause || list_member_oid(relids_with_rf, myrelid))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("conflicting or redundant WHERE clauses for table \"%s\"",
+ relname)));
+
+ pfree(relname);
continue;
}
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
+ if (t->whereClause)
+ relids_with_rf = lappend_oid(relids_with_rf, myrelid);
+
/*
* Add children of this rel, if requested, so that they too are added
* to the publication. A partitioned table can't have any inheritance
@@ -967,6 +1007,8 @@ OpenTableList(List *tables)
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -974,6 +1016,7 @@ OpenTableList(List *tables)
}
list_free(relids);
+ list_free(relids_with_rf);
return rels;
}
@@ -993,6 +1036,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -1088,6 +1133,11 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
RelationGetRelationName(rel))));
}
+ if (pubrel->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("cannot use a WHERE clause when removing a table from publication")));
+
ObjectAddressSet(obj, PublicationRelRelationId, prid);
performDeletion(&obj, DROP_CASCADE, 0);
}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 297b6ee715..be9c1fbf32 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4832,6 +4832,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index f537d3eb96..57764470dc 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2312,6 +2312,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
old mode 100644
new mode 100755
index 86ce33bd97..c9ccbf31af
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9654,12 +9654,13 @@ CreatePublicationStmt:
* relation_expr here.
*/
PublicationObjSpec:
- TABLE relation_expr
+ TABLE relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_TABLE;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $2;
+ $$->pubtable->whereClause = $3;
}
| ALL TABLES IN_P SCHEMA ColId
{
@@ -9674,28 +9675,45 @@ PublicationObjSpec:
$$->pubobjtype = PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA;
$$->location = @5;
}
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ /*
+ * The OptWhereClause must be stored here but it is
+ * valid only for tables. If the ColId was mistakenly
+ * not a table this will be detected later in
+ * preprocess_pubobj_list() and an error is thrown.
+ */
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
$$->location = @1;
}
- | ColId indirection
+ | ColId indirection OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
+ $$->pubtable->whereClause = $3;
$$->location = @1;
}
/* grammar like tablename * , ONLY tablename, ONLY ( tablename ) */
- | extended_relation_expr
+ | extended_relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $1;
+ $$->pubtable->whereClause = $2;
}
| CURRENT_SCHEMA
{
@@ -17343,7 +17361,8 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ if (pubobj->name)
{
/* convert it to PublicationTable */
PublicationTable *pubtable = makeNode(PublicationTable);
@@ -17356,6 +17375,13 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
else if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_SCHEMA ||
pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA)
{
+ /* WHERE clause is not allowed on a schema object */
+ if (pubobj->pubtable && pubobj->pubtable->whereClause)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("WHERE clause for schema not allowed"),
+ parser_errposition(pubobj->location));
+
/*
* We can distinguish between the different type of schema
* objects based on whether name and pubtable is set.
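For reference, the grammar changes above make the per-table WHERE clause usable in the forms sketched below (illustrative table and publication names, not taken from the patch's tests):

```sql
-- WHERE clause attached to a single table
CREATE PUBLICATION pub_active FOR TABLE employees WHERE (active IS TRUE);

-- Continuation entries (ColId / qualified names) each take their own filter
CREATE PUBLICATION pub_multi FOR TABLE hr.employees WHERE (salary > 1000),
                                       departments WHERE (id <= 3000);

-- Rejected in preprocess_pubobj_list(): a schema entry cannot take a filter
CREATE PUBLICATION pub_bad FOR ALL TABLES IN SCHEMA hr WHERE (id > 10);
```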
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a05a9..193c87d8b7 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477154..3d43839b35 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 542f9167aa..29bebb73eb 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23afc..29f8835ce1 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
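Taken together, the parser restrictions above (no aggregates, window functions, functions, subqueries, SRFs, or user-defined operators) mean a publication WHERE clause must stay a simple expression. A sketch of what the patch accepts and rejects, inferred from the new error paths:

```sql
-- OK: simple comparisons using built-in operators
CREATE PUBLICATION p1 FOR TABLE t WHERE (id > 100 AND id <= 200);

-- Rejected: function calls are not allowed in publication WHERE expressions
CREATE PUBLICATION p2 FOR TABLE t WHERE (lower(name) = 'foo');

-- Rejected: subqueries are not allowed in publication WHERE expressions
CREATE PUBLICATION p3 FOR TABLE t WHERE (id IN (SELECT id FROM u));
```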
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a43c..9041847087 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,82 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get relation qual. DISTINCT prevents the same expression for a table in
+ * multiple publications from being included multiple times in the final
+ * expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ /*
+ * Multiple row filter expressions for the same table will be combined
+ * by COPY using OR. If any publication includes this table without a
+ * row filter, the whole table will be copied. Hence, there is no need
+ * to build a unified row filter expression at all.
+ */
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ /*
+ * One entry without a row filter expression means we must clean
+ * up any previously collected expressions and return with no
+ * expressions.
+ */
+ if (isnull)
+ {
+ if (*qual)
+ {
+ list_free(*qual);
+ *qual = NIL;
+ }
+ ExecClearTuple(slot);
+ break;
+ }
+ else
+ {
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+ ExecClearTuple(slot);
+ }
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +889,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +898,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +909,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with OR'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +929,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of OR'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
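With the tablesync changes above, initial synchronization of a filtered table issues a COPY (SELECT ...) whose WHERE clause is the OR of all applicable filters. For example, if two publications filter the same table, the generated command might look like this (column and filter expressions are illustrative):

```sql
COPY (SELECT id, name, salary FROM public.employees
      WHERE (id > 2000 AND id <= 3000) OR (salary > 5000)) TO STDOUT;
```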
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6f6a203dea..c97e3da642 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,28 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/makefuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -115,6 +125,17 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ /*
+ * An ExprState pointer alone cannot distinguish between no cache, an
+ * invalid cache, and a valid cache, so the flag exprstate_valid
+ * indicates whether the current cache is valid. Multiple ExprState
+ * entries might be used if there are multiple publications for a
+ * single table, because differing publication actions don't always
+ * allow the expressions to be combined into one.
+ */
+ bool exprstate_valid;
+ ExprState *exprstate[3]; /* ExprState array for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+
/*
* OID of the relation to publish changes as. For a partition, this may
* be set to one of its ancestors whose schema will be used when
@@ -137,7 +158,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +167,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry, int action);
+
/*
* Specify output plugin callbacks
*/
@@ -620,6 +648,292 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
OutputPluginWrite(ctx, false);
}
+/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache the ExprState using CacheMemoryContext. This is the same code
+ * as ExecPrepareExpr(), which is not used here because it requires an
+ * EState and we don't have one. There should probably be a separate
+ * executor function to handle execution outside a normal Plan tree
+ * context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false i.e. the change
+ * isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, int action)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = false;
+ Oid relid = RelationGetRelid(relation);
+
+ /*
+ * The flag exprstate_valid indicates the row filter cache state. An
+ * invalid state means there is no cache yet or there is no row filter for
+ * this relation.
+ *
+ * Once the row filter is cached, it will be rebuilt only after a
+ * relation entry invalidation. Hence, it seems fine to cache it
+ * here.
+ *
+ * This code was not added to function get_rel_sync_entry() to avoid
+ * updating the cache even if it was not changed. Besides that, it
+ * postpones caching the expression near to the moment it will be used. It
+ * means that it won't waste cycles in changes (such as truncate) that
+ * won't use it or code paths that will eventually bail out without using
+ * this cache.
+ */
+ if (!entry->exprstate_valid)
+ {
+ MemoryContext oldctx;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+ List *rfnodes[3] = {NIL, NIL, NIL}; /* one Node List per
+ * publication operation */
+ bool rf_in_all_pubs[3] = {true, true, true}; /* row filter in all
+ * publications? */
+ Node *rfnode;
+ int i;
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains. Release the tuple table slot if it
+ * already exists.
+ */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
+ /*
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ *
+ * Multiple publications might have multiple row filters for this
+ * relation. Since row filter usage depends on the DML operation,
+ * there are multiple lists (one for each operation) which row filters
+ * will be appended.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+
+ /*
+ * Append multiple row filters according to the DML operation. If
+ * an entry does not have a row filter, remember this information
+ * (rf_in_all_pubs). It is used to discard all row filter
+ * expressions for that DML operation and, as a result, bail out
+ * through a fast path before initializing the state to process
+ * the row filter expression.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(entry->publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (rfisnull)
+ {
+ if (pub->pubactions.pubinsert)
+ rf_in_all_pubs[0] = false; /* INSERT */
+ if (pub->pubactions.pubupdate)
+ rf_in_all_pubs[1] = false; /* UPDATE */
+ if (pub->pubactions.pubdelete)
+ rf_in_all_pubs[2] = false; /* DELETE */
+ }
+ else
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ if (pub->pubactions.pubinsert)
+ rfnodes[0] = lappend(rfnodes[0], rfnode); /* INSERT */
+ if (pub->pubactions.pubupdate)
+ rfnodes[1] = lappend(rfnodes[1], rfnode); /* UPDATE */
+ if (pub->pubactions.pubdelete)
+ rfnodes[2] = lappend(rfnodes[2], rfnode); /* DELETE */
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+ }
+
+ /*
+ * For each publication operation, store a single row filter
+ * expression. Whether this expression is used depends on the
+ * rf_in_all_pubs value.
+ */
+ for (i = 0; i < 3; i++)
+ {
+ int n;
+
+ /*
+ * All row filter expressions will be discarded if there is one
+ * publication-relation entry without a row filter. That's because
+ * all expressions are aggregated by the OR operator. The absence
+ * of a row filter means "replicate all rows", which subsumes any
+ * other expression.
+ */
+ if (!rf_in_all_pubs[i])
+ {
+ list_free(rfnodes[i]);
+ entry->exprstate[i] = NULL;
+ continue;
+ }
+
+ n = list_length(rfnodes[i]);
+ if (n == 1)
+ rfnode = linitial(rfnodes[i]);
+ else if (n > 1)
+ rfnode = (Node *) makeBoolExpr(OR_EXPR, rfnodes[i], -1);
+ else
+ rfnode = NULL;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ if (n == 0)
+ entry->exprstate[i] = NULL;
+ else
+ entry->exprstate[i] = pgoutput_row_filter_init_expr(rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ entry->exprstate_valid = true;
+ }
+
+ /*
+ * Bail out if there is no row filter to process for this operation.
+ * This is a fast path optimization. Read the explanation above about
+ * rf_in_all_pubs.
+ */
+ if (action == REORDER_BUFFER_CHANGE_INSERT && entry->exprstate[0] == NULL)
+ return true;
+ if (action == REORDER_BUFFER_CHANGE_UPDATE && entry->exprstate[1] == NULL)
+ return true;
+ if (action == REORDER_BUFFER_CHANGE_DELETE && entry->exprstate[2] == NULL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * Process the row filter. Multiple row filters were already combined
+ * above.
+ */
+ if (action == REORDER_BUFFER_CHANGE_INSERT)
+ result = pgoutput_row_filter_exec_expr(entry->exprstate[0], ecxt);
+ else if (action == REORDER_BUFFER_CHANGE_UPDATE)
+ result = pgoutput_row_filter_exec_expr(entry->exprstate[1], ecxt);
+ else if (action == REORDER_BUFFER_CHANGE_DELETE)
+ result = pgoutput_row_filter_exec_expr(entry->exprstate[2], ecxt);
+ else
+ Assert(false);
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
/*
* Sends the decoded DML over wire.
*
@@ -647,7 +961,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +985,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +992,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry, change->action))
+ break;
+
+ /*
+ * Schema should be sent before the logic that replaces the
+ * relation because it also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +1025,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry, change->action))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +1059,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry, change->action))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +1128,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1116,9 +1450,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1139,8 +1474,11 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->exprstate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate[0] = entry->exprstate[1] = entry->exprstate[2] = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1245,9 +1583,6 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
}
list_free(pubids);
@@ -1340,6 +1675,8 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
*/
if (entry != NULL)
{
+ int i;
+
entry->schema_sent = false;
list_free(entry->streamed_txns);
entry->streamed_txns = NIL;
@@ -1354,6 +1691,24 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanups
+ */
+ entry->exprstate_valid = false;
+ for (i = 0; i < 3; i++)
+ {
+ if (entry->exprstate[i] != NULL)
+ {
+ pfree(entry->exprstate[i]);
+ entry->exprstate[i] = NULL;
+ }
+ }
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
}
}
@@ -1365,6 +1720,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1730,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1392,6 +1750,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index c590003f18..b029f7a398 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4265,6 +4265,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4275,9 +4276,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4286,6 +4294,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4326,6 +4335,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationObject(&(pubrinfo[j].dobj), fout);
@@ -4396,8 +4409,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
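The pg_dump change above appends the stored qualification to the ADD TABLE statement, so a dumped filtered table is recreated with its filter intact. A sketch of the resulting dump output (names are illustrative):

```sql
ALTER PUBLICATION pub_active ADD TABLE ONLY public.employees WHERE (active IS TRUE);
```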
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index d1d8608e9c..0842a3c936 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -630,6 +630,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index ea721d963a..41326c748d 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3150,17 +3150,22 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
" JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
"WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
"UNION\n"
"SELECT pubname\n"
+ " , pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+ " , pg_catalog.pg_class c\n"
"WHERE pr.prrelid = '%s'\n"
+ " AND c.oid = pr.prrelid\n"
"UNION\n"
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
"WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
"ORDER BY 1;",
@@ -3196,6 +3201,16 @@ describeOneTableDetails(const char *schemaname,
printfPQExpBuffer(&buf, " \"%s\"",
PQgetvalue(result, i, 0));
+ if (pset.sversion >= 150000)
+ {
+ /*
+ * Also display the publication row-filter (if any) for
+ * this table
+ */
+ if (!PQgetisnull(result, i, 1))
+ appendPQExpBuffer(&buf, " WHERE (%s)", PQgetvalue(result, i, 1));
+ }
+
printTableAddFooter(&cont, buf.data);
}
PQclear(result);
@@ -6320,8 +6335,12 @@ addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
for (i = 0; i < count; i++)
{
if (!singlecol)
+ {
printfPQExpBuffer(buf, " \"%s.%s\"", PQgetvalue(res, i, 0),
PQgetvalue(res, i, 1));
+ if (!PQgetisnull(res, i, 2))
+ appendPQExpBuffer(buf, " WHERE (%s)", PQgetvalue(res, i, 2));
+ }
else
printfPQExpBuffer(buf, " \"%s\"", PQgetvalue(res, i, 0));
@@ -6450,8 +6469,15 @@ describePublications(const char *pattern)
{
/* Get the tables for the specified publication */
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 2f412ca3db..e1be254b9a 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,22 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (");
+
+ /*
+ * "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with
+ * table attributes
+ */
+
+ /*
+ * "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with
+ * table attributes
+ */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2777,12 +2793,19 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("TABLES", "TABLES IN SCHEMA");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
- else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
+ /*
+ * "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with
+ * table attributes
+ */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
+
/*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
* ..."
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 1ae439e6f3..fa23f09d69 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -86,6 +86,7 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -122,7 +123,7 @@ extern List *GetPubPartitionOptionRelations(List *result,
Oid relid);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504cbb..154bb61777 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 067138e6b5..b38e6633fb 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3641,6 +3641,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee179082ce..d58ae6a63f 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 1feb558968..aa784350a2 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -239,6 +239,154 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_schema1;
+CREATE TABLE testpub_rf_schema1.testpub_rf_tbl5 (h integer);
+CREATE SCHEMA testpub_rf_schema2;
+CREATE TABLE testpub_rf_schema2.testpub_rf_tb16 (i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
+
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish = 'insert');
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish = 'insert');
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+ Table "public.testpub_rf_tbl1"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+----------+--------------+-------------
+ a | integer | | | | plain | |
+ b | text | | | | extended | |
+Publications:
+ "testpub5a" WHERE ((a > 1))
+ "testpub5b"
+ "testpub5c" WHERE ((a > 3))
+
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+ Publication testpub_syntax1
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE ((e < 999))
+
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_schema1.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+ Publication testpub_syntax2
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "testpub_rf_schema1.testpub_rf_tbl5" WHERE ((h < 999))
+
+DROP PUBLICATION testpub_syntax2;
+-- fail - schemas don't allow WHERE clause
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1 WHERE (a = 123);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...ntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1 WHERE (a =...
+ ^
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1, testpub_rf_schema1 WHERE (a = 123);
+ERROR: WHERE clause for schema not allowed
+LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1, testpub_rf...
+ ^
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE clause
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
+ERROR: conflicting or redundant WHERE clauses for table "testpub_rf_tbl1"
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
+ERROR: conflicting or redundant WHERE clauses for table "testpub_rf_tbl1"
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: cannot use a WHERE clause when removing a table from publication
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_schema2;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_schema2, TABLE testpub_rf_schema2.testpub_rf_tb16;
+ERROR: cannot add relation "testpub_rf_schema2.testpub_rf_tb16" to publication
+DETAIL: Table's schema "testpub_rf_schema2" is already part of the publication or part of the specified schema list.
+RESET client_min_messages;
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_schema1.testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema2.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_schema1;
+DROP SCHEMA testpub_rf_schema2;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 8fa0435c32..f8d70be24c 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -134,6 +134,81 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_schema1;
+CREATE TABLE testpub_rf_schema1.testpub_rf_tbl5 (h integer);
+CREATE SCHEMA testpub_rf_schema2;
+CREATE TABLE testpub_rf_schema2.testpub_rf_tb16 (i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish = 'insert');
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish = 'insert');
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_schema1.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;
+-- fail - schemas don't allow WHERE clause
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1 WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1, testpub_rf_schema1 WHERE (a = 123);
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE clause
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_schema2;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_schema2, TABLE testpub_rf_schema2.testpub_rf_tb16;
+RESET client_min_messages;
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_schema1.testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema2.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_schema1;
+DROP SCHEMA testpub_rf_schema2;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
new file mode 100644
index 0000000000..a2aa05f1e9
--- /dev/null
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -0,0 +1,362 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 10;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4a FOR TABLE tab_rowfilter_4 WHERE (c % 2 = 0)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4b FOR TABLE tab_rowfilter_4");
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) SELECT generate_series(1, 10)");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4a, tap_pub_4b"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if
+# the c value is divisible by 2 OR 3 (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(13|2|20),
+ 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but when there is no filter it just means
+# OR everything - i.e. the same as no filter at all.
+# Expect all rows: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(10|1|10),
+ 'check initial data copy from table tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false, so the row filter
+# of the partition itself is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is( $result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k'
+);
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) VALUES (21), (22), (23), (24), (25)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) VALUES (0), (11), (12)");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if
+# the c value is divisible by 2 OR 3.
+#
+# Expect original rows (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+# Plus (21, 22, 24)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(16|2|24), 'check replicated rows to tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but when there is no filter it just means
+# OR everything - i.e. the same as no filter at all.
+# Expect all rows from initial copy: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+# And also (0, 11, 12)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, the row filter references column b, which is not part
+# of the PK or REPLICA IDENTITY, so the old tuple contains b = NULL and the
+# row filter evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true, so the row filter
+# of the root partitioned table is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
2.20.1
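The NULL handling called out in the test comments above (the UPDATE to (1600, NULL) is not replicated) follows SQL three-valued logic: a row is published only when the filter evaluates to strictly true, and a NULL (unknown) result counts as "do not publish". A toy Python model of that rule for a filter like `a > 1000 AND b <> 'filtered'` (this is an illustration only, not the server code):

```python
# Toy model of SQL three-valued logic as applied by a row filter such as:
#   a > 1000 AND b <> 'filtered'
# In SQL, any comparison with NULL yields NULL (unknown), and a row
# filter only publishes the row when the result is strictly true.

def sql_ne(x, y):
    """SQL <>: returns None (unknown) if either operand is NULL."""
    if x is None or y is None:
        return None
    return x != y

def sql_and(x, y):
    """SQL AND with three-valued logic: False dominates, then unknown."""
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True

def row_filter_publishes(a, b):
    # Filter: a > 1000 AND b <> 'filtered'
    left = None if a is None else a > 1000
    result = sql_and(left, sql_ne(b, 'filtered'))
    return result is True   # unknown (None) counts as "do not publish"
```

This reproduces the test's expectations: (1601, 'test 1601 updated') passes, while (1600, NULL) does not, because `NULL <> 'filtered'` is unknown rather than true.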
v45-0002-Validates-a-row-filter-expression-using-a-walker.patchtext/x-patch; name=v45-0002-Validates-a-row-filter-expression-using-a-walker.patchDownload
From f84a3549a238bb5390cbcde5db6e5d37b415ff4d Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Thu, 2 Dec 2021 22:43:22 -0300
Subject: [PATCH v45 2/2] Validates a row filter expression using a walker
function
A limited set of expressions is allowed. Check REPLICA IDENTITY if
UPDATE and/or DELETE operations are published.
Discussion: https://postgr.es/m/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
Discussion: https://postgr.es/m/CAA4eK1%2BXoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ%40mail.gmail.com
---
src/backend/catalog/pg_publication.c | 146 ++++++++++++++++++++
src/backend/parser/parse_agg.c | 8 +-
src/backend/parser/parse_expr.c | 17 +--
src/backend/parser/parse_func.c | 3 +-
src/backend/parser/parse_oper.c | 7 -
src/backend/replication/pgoutput/pgoutput.c | 2 +-
src/test/regress/expected/publication.out | 63 ++++++---
src/test/regress/sql/publication.sql | 39 ++++--
src/test/subscription/t/027_row_filter.pl | 11 +-
9 files changed, 226 insertions(+), 70 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 651e719f81..81ded9d242 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -29,6 +29,7 @@
#include "catalog/objectaddress.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_namespace.h"
+#include "catalog/pg_proc.h"
#include "catalog/pg_publication.h"
#include "catalog/pg_publication_namespace.h"
#include "catalog/pg_publication_rel.h"
@@ -36,6 +37,7 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parse_clause.h"
#include "parser/parse_collate.h"
#include "parser/parse_relation.h"
@@ -48,6 +50,12 @@
#include "utils/rel.h"
#include "utils/syscache.h"
+typedef struct RowFilterCheckInfo
+{
+ Relation rel;
+ PublicationActions pubactions;
+} RowFilterCheckInfo;
+
/*
* Check if relation can be in given publication and throws appropriate
* error if not.
@@ -111,6 +119,141 @@ check_publication_add_schema(Oid schemaid)
errdetail("Temporary schemas cannot be replicated.")));
}
+/*
+ * This routine checks if the row filter expression is a "simple expression".
+ * By "simple expression" it means:
+ *
+ * - simple or compound expressions;
+ * Examples:
+ * (Var Op Const)
+ * (Var Op Var)
+ * (Var Op Const) Bool (Var Op Const)
+ * - user-defined operators are not allowed;
+ * - user-defined types are not allowed;
+ * - user-defined functions are not allowed;
+ * - non-immutable builtin functions are not allowed.
+ *
+ * NOTES:
+ *
+ * User-defined functions, operators or types are not allowed because
+ * (a) if a user drops a user-defined object used in a row filter expression,
+ * the logical decoding infrastructure won't be able to recover from such pilot
+ * error even if the object is recreated again because a historic snapshot is
+ * used to execute the row filter.
+ * (b) a user-defined function can be used to access tables which could have
+ * unpleasant results because a historic snapshot is used. That's why only
+ * immutable builtin functions are allowed in row filter expressions.
+ */
+static bool
+publication_row_filter_walker(Node *node, RowFilterCheckInfo *rfinfo)
+{
+ char *errormsg = NULL;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+ AttrNumber attnum = var->varattno;
+ char replident = rfinfo->rel->rd_rel->relreplident;
+
+ /*
+ * Check replica identity if the publication allows UPDATE and/or
+ * DELETE as DML operations. REPLICA IDENTITY FULL is OK since it
+ * includes all columns in the old tuple.
+ */
+ if ((rfinfo->pubactions.pubupdate || rfinfo->pubactions.pubdelete) &&
+ replident != REPLICA_IDENTITY_FULL)
+ {
+ Bitmapset *bms_replident = NULL;
+
+ if (replident == REPLICA_IDENTITY_DEFAULT)
+ bms_replident = RelationGetIndexAttrBitmap(rfinfo->rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else if (replident == REPLICA_IDENTITY_INDEX)
+ bms_replident = RelationGetIndexAttrBitmap(rfinfo->rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+
+ /*
+ * REPLICA IDENTITY NOTHING does not contain columns in the old
+ * tuple so it is not supported. The referenced column must be
+ * contained by REPLICA IDENTITY DEFAULT (primary key) or REPLICA
+ * IDENTITY INDEX (index columns).
+ */
+ if (replident == REPLICA_IDENTITY_NOTHING ||
+ !bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, bms_replident))
+ {
+ const char *colname = get_attname(RelationGetRelid(rfinfo->rel), attnum, false);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ RelationGetRelationName(rfinfo->rel)),
+ errdetail("Column \"%s\" used in the WHERE expression is not part of the replica identity.",
+ colname)));
+ }
+
+ bms_free(bms_replident);
+ }
+ else if (var->vartype >= FirstNormalObjectId)
+ {
+ errormsg = _("User-defined types are not allowed.");
+ }
+ }
+ else if (IsA(node, List) || IsA(node, Const) || IsA(node, BoolExpr) || IsA(node, NullIfExpr) || IsA(node, NullTest) || IsA(node, BooleanTest) || IsA(node, CoalesceExpr) || IsA(node, CaseExpr) || IsA(node, CaseTestExpr) || IsA(node, MinMaxExpr) || IsA(node, ArrayExpr) || IsA(node, ScalarArrayOpExpr) || IsA(node, XmlExpr))
+ {
+ /* nodes are part of simple expressions */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ if (((OpExpr *) node)->opno >= FirstNormalObjectId)
+ errormsg = _("User-defined operators are not allowed.");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ Oid funcid = ((FuncExpr *) node)->funcid;
+ const char *funcname = get_func_name(funcid);
+
+ if (funcid >= FirstNormalObjectId)
+ errormsg = psprintf(_("User-defined functions are not allowed (%s)."), funcname);
+ else if (func_volatile(funcid) != PROVOLATILE_IMMUTABLE)
+ errormsg = psprintf(_("Non-immutable functions are not allowed (%s)."), funcname);
+ }
+ else
+ {
+ elog(WARNING, "unexpected node: %s", nodeToString(node));
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(rfinfo->rel)),
+ errdetail("Expressions only allow columns, constants and some builtin functions and operators.")));
+ }
+
+ if (errormsg)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(rfinfo->rel)),
+ errdetail("%s", errormsg)));
+
+ return expression_tree_walker(node, publication_row_filter_walker, (void *) rfinfo);
+}
+
+/*
+ * Validate the row filter with the following rules:
+ * (a) few node types are allowed in the expression. See the function
+ * publication_row_filter_walker for details.
+ * (b) If the publication publishes UPDATE and/or DELETE operations, all
+ * columns used in the row filter must be contained in the replica identity.
+ */
+static void
+check_publication_row_filter(PublicationActions pubactions, Relation rel, Node *rfnode)
+{
+ RowFilterCheckInfo rfinfo = {0};
+
+ rfinfo.rel = rel;
+ rfinfo.pubactions = pubactions;
+
+ publication_row_filter_walker(rfnode, &rfinfo);
+}
+
/*
* Returns if relation represented by oid and Form_pg_class entry
* is publishable.
@@ -317,6 +460,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
EXPR_KIND_PUBLICATION_WHERE,
"PUBLICATION");
+ /* Validate row filter expression */
+ check_publication_row_filter(pub->pubactions, targetrel, whereclause);
+
/* Fix up collation information */
assign_expr_collations(pstate, whereclause);
}
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d8b7..7388bbdbd4 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,7 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
- if (isAgg)
- err = _("aggregate functions are not allowed in publication WHERE expressions");
- else
- err = _("grouping operations are not allowed in publication WHERE expressions");
-
+ /* okay (see function publication_row_filter_walker) */
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +947,7 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
- err = _("window functions are not allowed in publication WHERE expressions");
+ /* okay (see function publication_row_filter_walker) */
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839b35..f1bb01c80d 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,19 +200,8 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- {
- /*
- * Forbid functions in publication WHERE condition
- */
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("functions are not allowed in publication WHERE expressions"),
- parser_errposition(pstate, exprLocation(expr))));
-
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
- }
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -1777,7 +1766,7 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
- err = _("cannot use subquery in publication WHERE expression");
+ /* okay (see function publication_row_filter_walker) */
break;
/*
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 29bebb73eb..ffdf86fc7e 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,8 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
- err = _("set-returning functions are not allowed in publication WHERE expressions");
+ /* okay (see function publication_row_filter_walker) */
+ pstate->p_hasTargetSRFs = true;
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835ce1..bc34a23afc 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,13 +718,6 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
- /* Check it's not a custom operator for publication WHERE expressions */
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("user-defined operators are not allowed in publication WHERE expressions"),
- parser_errposition(pstate, location)));
-
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index c97e3da642..0f704792e4 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -857,7 +857,7 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
/*
* All row filter expressions will be discarded if there is one
* publication-relation entry without a row filter. That's because
- * all expressionsare aggregated by the OR operator. The row
+ * all expressions are aggregated by the OR operator. The row
* filter absence means replicate all rows so a single valid
* expression means publish this row.
*/
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index aa784350a2..d978ee7014 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -243,18 +243,19 @@ CREATE TABLE testpub_rf_tbl1 (a integer, b text);
CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE TABLE testpub_rf_tbl5 (a xml);
CREATE SCHEMA testpub_rf_schema1;
CREATE TABLE testpub_rf_schema1.testpub_rf_tbl5 (h integer);
CREATE SCHEMA testpub_rf_schema2;
CREATE TABLE testpub_rf_schema2.testpub_rf_tb16 (i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -264,7 +265,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE (((c <> 'test'::text) AND (d < 5)))
@@ -275,7 +276,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (((e > 1000) AND (e < 2000)))
@@ -286,7 +287,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE (((e > 300) AND (e < 500)))
@@ -310,26 +311,26 @@ Publications:
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax1
Publication testpub_syntax1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE ((e < 999))
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_schema1.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_schema1.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax2
Publication testpub_syntax2
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"testpub_rf_schema1.testpub_rf_tbl5" WHERE ((h < 999))
@@ -353,20 +354,36 @@ ERROR: conflicting or redundant WHERE clauses for table "testpub_rf_tbl1"
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
ERROR: conflicting or redundant WHERE clauses for table "testpub_rf_tbl1"
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
--- fail - user-defined operators disallowed
-CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
-CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+-- fail - user-defined operators are not allowed
+CREATE FUNCTION testpub_rf_func1(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func1, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: User-defined operators are not allowed.
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func2() RETURNS integer AS $$ BEGIN RETURN 123; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a >= testpub_rf_func2());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: User-defined functions are not allowed (testpub_rf_func2).
+-- fail - non-immutable functions are not allowed. random() is volatile.
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a < random());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: Non-immutable functions are not allowed (random).
+-- ok - builtin operators are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
+-- ok - immutable builtin functions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- ok - conditional expressions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (a IS DOCUMENT);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (xmlexists('//foo[text() = ''bar'']' PASSING BY VALUE a));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1, 2) = a);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (CASE a WHEN 5 THEN true ELSE false END);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (COALESCE(b, 'foo') = 'foo');
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (GREATEST(a, 10) > 10);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IN (2, 4, 6));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (ARRAY[a] <@ ARRAY[2, 4, 6]);
-- fail - WHERE not allowed in DROP
-ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl1 WHERE (e < 27);
ERROR: cannot use a WHERE clause when removing a table from publication
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
SET client_min_messages = 'ERROR';
@@ -379,6 +396,7 @@ DROP TABLE testpub_rf_tbl1;
DROP TABLE testpub_rf_tbl2;
DROP TABLE testpub_rf_tbl3;
DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_tbl5;
DROP TABLE testpub_rf_schema1.testpub_rf_tbl5;
DROP TABLE testpub_rf_schema2.testpub_rf_tb16;
DROP SCHEMA testpub_rf_schema1;
@@ -386,7 +404,8 @@ DROP SCHEMA testpub_rf_schema2;
DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
-DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func1(integer, integer);
+DROP FUNCTION testpub_rf_func2();
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index f8d70be24c..7ab0ef7e63 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -138,12 +138,13 @@ CREATE TABLE testpub_rf_tbl1 (a integer, b text);
CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE TABLE testpub_rf_tbl5 (a xml);
CREATE SCHEMA testpub_rf_schema1;
CREATE TABLE testpub_rf_schema1.testpub_rf_tbl5 (h integer);
CREATE SCHEMA testpub_rf_schema2;
CREATE TABLE testpub_rf_schema2.testpub_rf_tb16 (i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -163,12 +164,12 @@ RESET client_min_messages;
DROP PUBLICATION testpub5a, testpub5b, testpub5c;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax1
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_schema1.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_schema1.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax2
DROP PUBLICATION testpub_syntax2;
@@ -182,14 +183,30 @@ SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
--- fail - user-defined operators disallowed
-CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
-CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+-- fail - user-defined operators are not allowed
+CREATE FUNCTION testpub_rf_func1(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func1, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func2() RETURNS integer AS $$ BEGIN RETURN 123; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a >= testpub_rf_func2());
+-- fail - non-immutable functions are not allowed. random() is volatile.
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a < random());
+-- ok - builtin operators are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
+-- ok - immutable builtin functions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- ok - conditional expressions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (a IS DOCUMENT);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (xmlexists('//foo[text() = ''bar'']' PASSING BY VALUE a));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1, 2) = a);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (CASE a WHEN 5 THEN true ELSE false END);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (COALESCE(b, 'foo') = 'foo');
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (GREATEST(a, 10) > 10);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IN (2, 4, 6));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (ARRAY[a] <@ ARRAY[2, 4, 6]);
-- fail - WHERE not allowed in DROP
-ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl1 WHERE (e < 27);
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_schema2;
@@ -200,6 +217,7 @@ DROP TABLE testpub_rf_tbl1;
DROP TABLE testpub_rf_tbl2;
DROP TABLE testpub_rf_tbl3;
DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_tbl5;
DROP TABLE testpub_rf_schema1.testpub_rf_tbl5;
DROP TABLE testpub_rf_schema2.testpub_rf_tb16;
DROP SCHEMA testpub_rf_schema1;
@@ -207,7 +225,8 @@ DROP SCHEMA testpub_rf_schema2;
DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
-DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func1(integer, integer);
+DROP FUNCTION testpub_rf_func2();
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index a2aa05f1e9..affd206e8a 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -18,6 +18,8 @@ $node_subscriber->start;
# setup structure on publisher
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
@@ -231,14 +233,10 @@ $node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
-$node_publisher->safe_psql('postgres',
- "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
$node_publisher->safe_psql('postgres',
"UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
$node_publisher->safe_psql('postgres',
"UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
-$node_publisher->safe_psql('postgres',
- "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_2 (c) VALUES (21), (22), (23), (24), (25)");
$node_publisher->safe_psql('postgres',
@@ -281,12 +279,8 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
-# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -295,7 +289,6 @@ is( $result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
--
2.20.1
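The validation walker in the patch above leans on OID ranges: any operator or function whose OID is at or above FirstNormalObjectId (16384) is user-defined and therefore rejected, and builtin functions must additionally be immutable. A rough Python sketch of the same idea, with invented dict-based node shapes (the real implementation is the C function publication_row_filter_walker):

```python
# Illustrative sketch (not the C implementation) of the walker idea in
# publication_row_filter_walker: recursively visit an expression tree and
# reject any operator/function whose OID is >= FirstNormalObjectId
# (16384), i.e. any user-defined object. Node shapes here are invented
# for the example.

FIRST_NORMAL_OBJECT_ID = 16384

def validate_row_filter(node):
    """Return None if the expression is a 'simple expression',
    otherwise an error message (mirroring the errdetail strings)."""
    if node is None:
        return None
    kind = node.get('kind')
    if kind == 'OpExpr' and node['opno'] >= FIRST_NORMAL_OBJECT_ID:
        return 'User-defined operators are not allowed.'
    if kind == 'FuncExpr':
        if node['funcid'] >= FIRST_NORMAL_OBJECT_ID:
            return 'User-defined functions are not allowed.'
        if node.get('volatility', 'i') != 'i':   # 'i' = immutable
            return 'Non-immutable functions are not allowed.'
    # Recurse into child expressions, stopping at the first error.
    for child in node.get('args', []):
        err = validate_row_filter(child)
        if err:
            return err
    return None
```

The recursion step plays the role of expression_tree_walker: the per-node checks only look at the current node, and the tree walk guarantees every nested operator and function gets checked.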
This is great work, thanks for the update.
Euler Taveira <euler@eulerto.com> wrote on Sat., Dec. 4, 2021, 00:13:
On Thu, Dec 2, 2021, at 4:18 AM, Peter Smith wrote:
PSA a new v44* patch set.
We are actively developing this feature for some months and we improved
this
feature a lot. This has been a good team work. It seems a good time to
provide
a retrospective for this feature based on the consensus we reached until
now.The current design has one row filter per publication-table mapping. It
allows
flexible choices while using the same table for multiple replication
purposes.
The WHERE clause was chosen as the syntax to declare the row filter
expression
(enclosed by parentheses).

There was a lot of discussion about which columns are allowed in the row
filter expression. The consensus was that publications that publish UPDATE
and/or DELETE operations should check that the columns in the row filter
expression are part of the replica identity. Otherwise, these DML
operations couldn't be replicated.

We also discussed which expressions should be allowed. We couldn't allow
all kinds of expressions because, given the way the logical decoding
infrastructure was designed, some expressions could break the replication.
Hence, we decided to allow only "simple expressions". By "simple
expression", we mean to restrict (a) user-defined objects (functions,
operators, types) and (b) non-immutable builtin functions.

A subscription can subscribe to multiple publications, and these
publications can publish the same table. In this case, we have to combine
the row filter expressions to decide if the row will be replicated or not.
The consensus was to replicate a row if any of the row filters returns
true. It means that if one publication-table mapping does not have a row
filter, the row will be replicated. There is an optimization for this case
that provides an empty expression for such a table; hence, it bails out
and replicates the row without running the row filter code.

The same logic applies to the initial table synchronization if there are
multiple row filters: copy all rows that satisfy at least one row filter
expression. If the subscriber is a pre-15 version, data synchronization
won't use row filters even if they are defined in the publisher.

If we are dealing with partitioned tables, the publication parameter
publish_via_partition_root determines if it uses the partition row filter
(false) or the root partitioned table row filter (true).

I used the last patch series (v44) posted by Peter Smith [1]. I made a lot
of improvements in this new version (v45). I merged 0001 (it is basically
the main patch I wrote) and 0004 (autocomplete). As I explained in [2], I
implemented a patch (incorporated into v45-0001) to fix this issue. I saw
that Peter already proposed a slightly different patch (0006). I read this
patch and concluded that it would be better to keep the version I have. It
fixes a few things and also includes more comments. I attached another
patch (v45-0002) that includes the expression validation. It is based on
0002, which I completely overhauled. There are additional expressions that
were not supported by the previous version (such as conditional
expressions [CASE, COALESCE, NULLIF, GREATEST, LEAST], array operators,
and XML operators). I probably haven't finished the supported node list
(there are a few primitive nodes that still need to be checked). However,
the current "simple expression" routine seems promising. I plan to
integrate v45-0002 in the next patch version. I attached it here for
comparison purposes only.

My next step is to review 0003. As I said before, I would like to treat it
as a separate feature. I know that it is useful for data consistency, but
this patch is already too complex. Having said that, I didn't include it
in this patch series because it doesn't apply cleanly. If Ajin would like
to provide a new version, I would appreciate it.

PS> I will update the commit message in the next version. I barely changed
the documentation to reflect the current behavior. I probably missed some
changes but I will fix them in the next version.

[1]
/messages/by-id/CAHut+PtJnnM8MYQDf7xCyFAp13U_0Ya2dv-UQeFD=ghixFLZiw@mail.gmail.com
[2]
/messages/by-id/ca8d270d-f930-4d15-9f24-60f95b364173@www.fastmail.com

--
Euler Taveira
EDB https://www.enterprisedb.com/
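To make the filter-combination rule above concrete, here is an
illustrative sketch (Python, with invented names; this is a model of the
described semantics, not code from the patch): a row is replicated if any
publication's filter passes, and a publication without a filter behaves
like an always-true expression, so evaluation can bail out early.

```python
# Model: each publication-table mapping contributes either a predicate or
# None (no WHERE clause). Filters are combined with OR semantics.

def should_replicate(row, filters):
    """filters: list of predicates, or None for a publication without a filter."""
    if any(f is None for f in filters):
        # Optimization described above: one publication has no row filter,
        # so replicate unconditionally without running the filter code.
        return True
    return any(f(row) for f in filters)
```

The same predicate would also drive the initial table synchronization:
copy every row for which `should_replicate` returns true.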
On Fri, Dec 3, 2021, at 8:12 PM, Euler Taveira wrote:
PS> I will update the commit message in the next version. I barely changed the
documentation to reflect the current behavior. I probably missed some changes
but I will fix in the next version.
I realized that I forgot to mention a few things about the UPDATE behavior.
Regardless of 0003, we need to define which tuple will be used to evaluate the
row filter for UPDATEs. We already discussed it circa [1]. This current version
chooses the *new* tuple. Is it the best choice?
Let's check all cases. There are 2 rows on the provider. One row satisfies the
row filter and the other one doesn't. For each case, I expect the initial rows
to be there (no modifications). The DDLs are:
CREATE TABLE foo (a integer, b text, PRIMARY KEY(a));
INSERT INTO foo (a, b) VALUES(10, 'abc'),(30, 'abc');
CREATE PUBLICATION bar FOR TABLE foo WHERE (a > 20);
The tables below describe what happens on the subscriber. BEFORE is the current
row on the subscriber. OLD, NEW and OLD & NEW show the action/row for each of
the different ways to evaluate the row filter.
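The three evaluation strategies can be modeled with a small sketch
(Python, purely for illustration; the function names are invented) using
the example filter (a > 20):

```python
# Each replicated UPDATE carries an old and a new tuple; the open question
# is which tuple(s) the row filter inspects before the change is sent.

def row_filter(tup):
    return tup["a"] > 20          # the WHERE (a > 20) clause from the example

def publishes_update(old, new, strategy):
    """Decide whether the UPDATE is sent to the subscriber."""
    if strategy == "old":
        return row_filter(old)
    if strategy == "new":
        return row_filter(new)
    if strategy == "old_and_new":
        return row_filter(old) and row_filter(new)
    raise ValueError(strategy)
```

The four cases below simply enumerate this decision for the possible
old/new combinations.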
-- case 1: old tuple (10, abc) ; new tuple (10, def)
UPDATE foo SET b = 'def' WHERE a = 10;
+-----------+--------------------+------------------+------------------+
| BEFORE | OLD | NEW | OLD & NEW |
+-----------+--------------------+------------------+------------------+
| NA | NA | NA | NA |
+-----------+--------------------+------------------+------------------+
If the old and new tuple don't satisfy the row filter, there is no issue.
-- case 2: old tuple (30, abc) ; new tuple (30, def)
UPDATE foo SET b = 'def' WHERE a = 30;
+-----------+--------------------+------------------+------------------+
| BEFORE | OLD | NEW | OLD & NEW |
+-----------+--------------------+------------------+------------------+
| (30, abc) | UPDATE (30, def) | UPDATE (30, def) | UPDATE (30, def) |
+-----------+--------------------+------------------+------------------+
If the old and new tuple satisfy the row filter, there is no issue.
-- case 3: old tuple (30, abc) ; new tuple (10, def)
UPDATE foo SET a = 10, b = 'def' WHERE a = 30;
+-----------+--------------------+------------------+------------------+
| BEFORE | OLD | NEW | OLD & NEW |
+-----------+--------------------+------------------+------------------+
| (30, abc) | UPDATE (10, def) * | KEEP (30, abc) * | KEEP (30, abc) * |
+-----------+--------------------+------------------+------------------+
If the old tuple satisfies the row filter but the new tuple doesn't, we have a
data consistency issue. Since the old tuple satisfies the row filter, the
initial table synchronization copies this row. However, after the UPDATE the
new tuple doesn't satisfy the row filter then, from the data consistency
perspective, that row should be removed on the subscriber.
The OLD approach sends the UPDATE because the old tuple satisfies the row
filter (in a sharding solution, this new row should be moved to another node).
The new row would likely not be modified by replication again. That's a data
inconsistency according to the row filter.
The NEW and OLD & NEW approaches don't send the UPDATE because the new tuple
doesn't satisfy the row filter. Keeping the old row is undesirable because it
doesn't reflect what we have on the source. This row on the subscriber would
likely not be modified by replication again. If someone inserted a new row with
a = 30 on the provider, replication would stop because there is already a row
with that value.
-- case 4: old tuple (10, abc) ; new tuple (30, def)
UPDATE foo SET a = 30, b = 'def' WHERE a = 10;
+-----------+--------------------+------------------+------------------+
| BEFORE | OLD | NEW | OLD & NEW |
+-----------+--------------------+------------------+------------------+
| NA | NA ! | NA ! | NA |
+-----------+--------------------+------------------+------------------+
The OLD and OLD & NEW approaches don't send the UPDATE because the old tuple
doesn't satisfy the row filter. The NEW approach sends the UPDATE because the
new tuple satisfies the row filter, but there is no row to modify; the current
behavior does nothing. However, it should INSERT the new tuple instead.
Subsequent UPDATEs or DELETEs have no effect. It could be a surprise for an
application that expects the same data set as the provider.
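A rough way to see the case 3 and case 4 inconsistencies is to simulate
what the subscriber ends up holding when the publisher evaluates the
filter (a > 20) against the old tuple only (an illustrative Python sketch
with made-up names, not PostgreSQL code; rows are modeled as {a: b} with
column a as the replica identity):

```python
def ideal(provider):
    """What a fully consistent subscriber should hold under WHERE (a > 20)."""
    return {a: b for a, b in provider.items() if a > 20}

def replicate_update(subscriber, old_a, new_a, new_b):
    if old_a > 20:                 # OLD tuple passes the filter: UPDATE is sent
        if old_a in subscriber:    # locate the row by replica identity
            del subscriber[old_a]
            subscriber[new_a] = new_b
    # else: the UPDATE is filtered out and the subscriber never sees the new row
    return subscriber

# Case 3: old (30, abc) -> new (10, def). The UPDATE is sent, so the
# subscriber keeps a row (10, def) even though no row should pass the filter.
sub3 = replicate_update({30: "abc"}, 30, 10, "def")

# Case 4: old (10, abc) -> new (30, def). The UPDATE is filtered out, so the
# subscriber is missing (30, def), which should have appeared.
sub4 = replicate_update({}, 10, 30, "def")
```

Comparing `sub3`/`sub4` against `ideal(...)` of the provider's final state
shows the drift in both directions.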
If we have to choose a default behavior, I would say use the old tuple to
evaluate the row filter. Why? The validation already restricts the columns to
the replica identity, so there isn't an issue with missing (NULL) columns. Case
3 updates the row with a value that is not consistent, but keeping the old row
is worse because it could stop the replication if someone inserted the old key
in a new row on the provider. Case 4 ignores the UPDATE if it cannot find the
tuple, but it could raise an error if there were a strict mode.
Since this change is very simple to revert, this new version contains this
modification. I also improved the documentation and removed extra parentheses
from psql/pg_dump. As I said in the previous email, I merged the validation
patch too.
FWIW, in the previous version I removed code that compares nodes to decide if
it is necessary to remove the publication-relation entry. I had similar code
in an ancient version of this patch but decided that the additional code is
not worth it.
There is at least one issue in the current code that should be addressed: PK or
REPLICA IDENTITY modification could break the publication check for UPDATEs and
DELETEs.
[1]: /messages/by-id/202107162135.m5ehijgcasjk@alvherre.pgsql
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
0001-Row-filter-for-logical-replication.patch (text/x-patch)
From 69ddc44c006d29436d8b5a912ac2386f6be460e0 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 22 Nov 2021 15:02:19 -0300
Subject: [PATCH] Row filter for logical replication
This feature adds row filter for publication tables. When a publication is
defined or modified, rows that don't satisfy an optional WHERE clause will be
filtered out. This allows a database or set of tables to be partially
replicated. The row filter is per table. A new row filter can be added simply
by specifying a WHERE clause after the table name. The WHERE clause must be
enclosed by parentheses.
The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any UPDATEs and
DELETEs won't be replicated. For UPDATE and DELETE commands, it uses the old
row version (that is limited to primary key or REPLICA IDENTITY) to evaluate
the row filter. INSERT uses the new row version to evaluate the row filter,
hence, you can use any column. If the row filter evaluates to NULL, it returns
false. The WHERE clause allows simple expressions. Simple expressions cannot
contain any aggregate or window functions, non-immutable functions,
user-defined types, operators or functions. This restriction could possibly be
addressed in the future.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy any
expressions to be copied. In this case of different WHERE clauses, if one of
the expressions is not defined, rows are always replicated regardless of the
definition of the other expressions. If subscriber is a pre-15 version, data
synchronization won't use row filters if they are defined in the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
Discussion: https://postgr.es/m/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 13 +-
doc/src/sgml/ref/create_publication.sgml | 35 +-
doc/src/sgml/ref/create_subscription.sgml | 23 +-
src/backend/catalog/pg_publication.c | 193 +++++++++-
src/backend/commands/publicationcmds.c | 76 +++-
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 38 +-
src/backend/parser/parse_agg.c | 6 +
src/backend/parser/parse_expr.c | 6 +
src/backend/parser/parse_func.c | 4 +
src/backend/replication/logical/tablesync.c | 118 +++++-
src/backend/replication/pgoutput/pgoutput.c | 391 +++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 24 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 26 +-
src/bin/psql/tab-complete.c | 27 +-
src/include/catalog/pg_publication.h | 3 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 167 +++++++++
src/test/regress/sql/publication.sql | 94 +++++
src/test/subscription/t/027_row_filter.pl | 355 ++++++++++++++++++
25 files changed, 1566 insertions(+), 52 deletions(-)
mode change 100644 => 100755 src/backend/parser/gram.y
create mode 100644 src/test/subscription/t/027_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1d11be73f..af6b1f684c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6299,6 +6299,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index bb4ef5e5e2..5d9869c4f6 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -30,7 +30,7 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -52,7 +52,9 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
remove one or more tables/schemas from the publication. Note that adding
tables/schemas to a publication that is already subscribed to will require a
<literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on the
- subscribing side in order to become effective.
+ subscribing side in order to become effective. Note also that the combination
+ of <literal>DROP</literal> with <literal>WHERE</literal> clause is not
+ allowed.
</para>
<para>
@@ -109,7 +111,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index d805e8e77a..b21595f955 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -28,7 +28,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -76,6 +76,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -225,6 +229,23 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
disallowed on those tables.
</para>
+ <para>
+ The <literal>WHERE</literal> clause should contain only columns that are
+ part of the primary key or be covered by <literal>REPLICA
+ IDENTITY</literal> otherwise, <command>UPDATE</command> and
+ <command>DELETE</command> operations will not be replicated. That's because
+ old row is used and it only contains primary key or columns that are part of
+ the <literal>REPLICA IDENTITY</literal>; the remaining columns are
+ <literal>NULL</literal>. For <command>INSERT</command> operations, any column
+ might be used in the <literal>WHERE</literal> clause. New row is used and it
+ contains all columns. A <literal>NULL</literal> value causes the expression
+ to evaluate to false; avoid using columns without not-null constraints in the
+ <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause allows
+ simple expressions. The simple expression cannot contain any aggregate or
+ window functions, non-immutable functions, user-defined types, operators or
+ functions.
+ </para>
+
<para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
@@ -247,6 +268,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -259,6 +285,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</programlisting>
</para>
+ <para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
<para>
Create a publication that publishes all changes in all tables:
<programlisting>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f1a1..4baa7fd781 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -208,6 +208,13 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
that are being subscribed to when the replication starts.
The default is <literal>true</literal>.
</para>
+ <para>
+ If the publications contain conditional expressions, it will affect
+ what data is copied. Refer to the
+ <xref linkend="sql-createsubscription-notes" /> for details.
+ Row-filtering may also apply here and will affect what data is
+ copied. Refer to the Notes section below.
+ </para>
</listitem>
</varlistentry>
@@ -293,7 +300,7 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</variablelist>
</refsect1>
- <refsect1>
+ <refsect1 id="sql-createsubscription-notes" xreflabel="Notes">
<title>Notes</title>
<para>
@@ -319,6 +326,20 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
the parameter <literal>create_slot = false</literal>. This is an
implementation restriction that might be lifted in a future release.
</para>
+
+ <para>
+ If any table in the publication has a <literal>WHERE</literal> clause, rows
+ that do not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be replicated. If the subscription has several publications in
+ which the same table has been published with different
+ <literal>WHERE</literal> clauses, a row will be replicated if any of the
+ expressions is satisfied. In this case of different <literal>WHERE</literal>
+ clauses, if one of the expressions is not defined, rows are always replicated
+ regardless of the definition of the other expressions. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row filtering
+ is ignored during the initial table synchronization phase.
+ </para>
+
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 63579b2f82..04e2270293 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -29,6 +29,7 @@
#include "catalog/objectaddress.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_namespace.h"
+#include "catalog/pg_proc.h"
#include "catalog/pg_publication.h"
#include "catalog/pg_publication_namespace.h"
#include "catalog/pg_publication_rel.h"
@@ -36,6 +37,10 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -45,6 +50,12 @@
#include "utils/rel.h"
#include "utils/syscache.h"
+typedef struct RowFilterCheckInfo
+{
+ Relation rel;
+ PublicationActions pubactions;
+} RowFilterCheckInfo;
+
/*
* Check if relation can be in given publication and throws appropriate
* error if not.
@@ -108,6 +119,140 @@ check_publication_add_schema(Oid schemaid)
errdetail("Temporary schemas cannot be replicated.")));
}
+/*
+ * This routine checks if the row filter expression is a "simple expression".
+ * By "simple expression" it means:
+ *
+ * - simple or compound expressions;
+ * Examples:
+ * (Var Op Const)
+ * (Var Op Var)
+ * (Var Op Const) Bool (Var Op Const)
+ * - user-defined operators are not allowed;
+ * - user-defined types are not allowed;
+ * - user-defined functions are not allowed;
+ * - non-immutable builtin functions are not allowed.
+ *
+ * NOTES:
+ *
+ * User-defined functions, operators or types are not allowed because
+ * (a) if a user drops a user-defined object used in a row filter expression,
+ * the logical decoding infrastructure won't be able to recover from such pilot
+ * error even if the object is recreated again because a historic snapshot is
+ * used to execute the row filter.
+ * (b) a user-defined function can be used to access tables which could have
+ * unpleasant results because a historic snapshot is used. That's why only
+ * immutable functions are allowed in row filter expressions.
+ */
+static bool
+publication_row_filter_walker(Node *node, RowFilterCheckInfo *rfinfo)
+{
+ char *errormsg = NULL;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+ AttrNumber attnum = var->varattno;
+ char replident = rfinfo->rel->rd_rel->relreplident;
+
+ /*
+ * Check replica identity if the publication allows UPDATE and/or
+ * DELETE as DML operations. REPLICA IDENTITY FULL is OK since it
+ * includes all columns in the old tuple.
+ */
+ if ((rfinfo->pubactions.pubupdate || rfinfo->pubactions.pubdelete) &&
+ replident != REPLICA_IDENTITY_FULL)
+ {
+ Bitmapset *bms_replident = NULL;
+
+ if (replident == REPLICA_IDENTITY_DEFAULT)
+ bms_replident = RelationGetIndexAttrBitmap(rfinfo->rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else if (replident == REPLICA_IDENTITY_INDEX)
+ bms_replident = RelationGetIndexAttrBitmap(rfinfo->rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+
+ /*
+ * REPLICA IDENTITY NOTHING does not contain columns in the old
+ * tuple so it is not supported. The referenced column must be
+ * contained by REPLICA IDENTITY DEFAULT (primary key) or REPLICA
+ * IDENTITY INDEX (index columns).
+ */
+ if (replident == REPLICA_IDENTITY_NOTHING ||
+ !bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, bms_replident))
+ {
+ const char *colname = get_attname(RelationGetRelid(rfinfo->rel), attnum, false);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ RelationGetRelationName(rfinfo->rel)),
+ errdetail("Column \"%s\" used in the WHERE expression is not part of the replica identity.",
+ colname)));
+ }
+
+ bms_free(bms_replident);
+ }
+ else if (var->vartype >= FirstNormalObjectId)
+ {
+ errormsg = _("User-defined types are not allowed.");
+ }
+ }
+ else if (IsA(node, List) || IsA(node, Const) || IsA(node, BoolExpr) || IsA(node, NullIfExpr) || IsA(node, NullTest) || IsA(node, BooleanTest) || IsA(node, CoalesceExpr) || IsA(node, CaseExpr) || IsA(node, CaseTestExpr) || IsA(node, MinMaxExpr) || IsA(node, ArrayExpr) || IsA(node, ScalarArrayOpExpr) || IsA(node, XmlExpr))
+ {
+ /* nodes are part of simple expressions */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ if (((OpExpr *) node)->opno >= FirstNormalObjectId)
+ errormsg = _("User-defined operators are not allowed.");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ Oid funcid = ((FuncExpr *) node)->funcid;
+ const char *funcname = get_func_name(funcid);
+
+ if (funcid >= FirstNormalObjectId)
+ errormsg = psprintf(_("User-defined functions are not allowed (%s)."), funcname);
+ else if (func_volatile(funcid) != PROVOLATILE_IMMUTABLE)
+ errormsg = psprintf(_("Non-immutable functions are not allowed (%s)."), funcname);
+ }
+ else
+ {
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(rfinfo->rel)),
+ errdetail("Expressions only allows columns, constants and some builtin functions and operators.")));
+ }
+
+ if (errormsg)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(rfinfo->rel)),
+ errdetail("%s", errormsg)));
+
+ return expression_tree_walker(node, publication_row_filter_walker, (void *) rfinfo);
+}
+
+/*
+ * Validate the row filter with the following rules:
+ * (a) few node types are allowed in the expression. See the function
+ * publication_row_filter_walker for details.
+ * (b) If the publication publishes UPDATE and/or DELETE operations, all
+ * columns used in the row filter must be contained in the replica identity.
+ */
+static void
+check_publication_row_filter(PublicationActions pubactions, Relation rel, Node *rfnode)
+{
+ RowFilterCheckInfo rfinfo = {0};
+
+ rfinfo.rel = rel;
+ rfinfo.pubactions = pubactions;
+
+ publication_row_filter_walker(rfnode, &rfinfo);
+}
+
/*
* Returns if relation represented by oid and Form_pg_class entry
* is publishable.
@@ -257,18 +402,22 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -289,10 +438,33 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a ParseState to parse with */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION");
+
+ /* Validate row filter expression */
+ check_publication_row_filter(pub->pubactions, targetrel, whereclause);
+
+ /* Fix up collation information */
+ assign_expr_collations(pstate, whereclause);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -306,6 +478,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -322,6 +500,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 7d4a0e95f6..bade24f12a 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -529,40 +529,64 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *delrels = NIL;
ListCell *oldlc;
+ /*
+ * Check if the relation is member of the existing schema in the
+ * publication or member of the schema list specified.
+ */
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
- /* Calculate which relations to drop. */
+ /*
+ * Remove tables that are not found in the new table list. Remove
+ * tables that are being re-added with a different qual expression
+ * (including a table that has no qual expression) because simply
+ * updating the existing tuple is not enough due to qual expression
+ * dependencies.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
+ PublicationRelInfo *oldrel;
ListCell *newlc;
bool found = false;
+ HeapTuple rftuple;
+ bool rfisnull = true; /* default if tuple is not found */
+
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(oldrelid),
+ ObjectIdGetDatum(pubid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual,
+ &rfisnull);
+ ReleaseSysCache(rftuple);
+ }
foreach(newlc, rels)
{
PublicationRelInfo *newpubrel;
newpubrel = (PublicationRelInfo *) lfirst(newlc);
- if (RelationGetRelid(newpubrel->relation) == oldrelid)
+
+ /*
+ * Keep the table iif old and new table has no qual
+ * expression. Otherwise, this table will be included in the
+ * deletion list below.
+ */
+ if (oldrelid == RelationGetRelid(newpubrel->relation) &&
+ newpubrel->whereClause == NULL && rfisnull)
{
found = true;
break;
}
}
- /* Not yet in the list, open it and add to the list */
+
if (!found)
{
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrelid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
}
@@ -899,6 +923,7 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ List *relids_with_rf = NIL;
/*
* Open, share-lock, and check all the explicitly-specified relations
@@ -926,15 +951,30 @@ OpenTableList(List *tables)
*/
if (list_member_oid(relids, myrelid))
{
+ char *relname = pstrdup(RelationGetRelationName(rel));
+
table_close(rel, ShareUpdateExclusiveLock);
+
+ /* Disallow duplicate tables if there are any WHERE clauses. */
+ if (t->whereClause || list_member_oid(relids_with_rf, myrelid))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("conflicting or redundant WHERE clauses for table \"%s\"",
+ relname)));
+
+ pfree(relname);
continue;
}
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
+ if (t->whereClause)
+ relids_with_rf = lappend_oid(relids_with_rf, myrelid);
+
/*
* Add children of this rel, if requested, so that they too are added
* to the publication. A partitioned table can't have any inheritance
@@ -967,6 +1007,8 @@ OpenTableList(List *tables)
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -974,6 +1016,7 @@ OpenTableList(List *tables)
}
list_free(relids);
+ list_free(relids_with_rf);
return rels;
}
@@ -993,6 +1036,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -1088,6 +1133,11 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
RelationGetRelationName(rel))));
}
+ if (pubrel->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("cannot use a WHERE clause when removing a table from publication")));
+
ObjectAddressSet(obj, PublicationRelRelationId, prid);
performDeletion(&obj, DROP_CASCADE, 0);
}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 297b6ee715..be9c1fbf32 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4832,6 +4832,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index f537d3eb96..57764470dc 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2312,6 +2312,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
old mode 100644
new mode 100755
index 86ce33bd97..c9ccbf31af
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9654,12 +9654,13 @@ CreatePublicationStmt:
* relation_expr here.
*/
PublicationObjSpec:
- TABLE relation_expr
+ TABLE relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_TABLE;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $2;
+ $$->pubtable->whereClause = $3;
}
| ALL TABLES IN_P SCHEMA ColId
{
@@ -9674,28 +9675,45 @@ PublicationObjSpec:
$$->pubobjtype = PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA;
$$->location = @5;
}
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ /*
+ * The OptWhereClause must be stored here, but it is
+ * valid only for tables. If the ColId turns out not
+ * to be a table, this will be detected later in
+ * preprocess_pubobj_list() and an error will be
+ * thrown.
+ */
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
$$->location = @1;
}
- | ColId indirection
+ | ColId indirection OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
+ $$->pubtable->whereClause = $3;
$$->location = @1;
}
/* grammar like tablename * , ONLY tablename, ONLY ( tablename ) */
- | extended_relation_expr
+ | extended_relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $1;
+ $$->pubtable->whereClause = $2;
}
| CURRENT_SCHEMA
{
@@ -17343,7 +17361,8 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ if (pubobj->name)
{
/* convert it to PublicationTable */
PublicationTable *pubtable = makeNode(PublicationTable);
@@ -17356,6 +17375,13 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
else if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_SCHEMA ||
pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA)
{
+ /* WHERE clause is not allowed on a schema object */
+ if (pubobj->pubtable && pubobj->pubtable->whereClause)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("WHERE clause for schema not allowed"),
+ parser_errposition(pubobj->location));
+
/*
* We can distinguish between the different type of schema
* objects based on whether name and pubtable is set.
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a05a9..7388bbdbd4 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,9 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ /* okay (see function publication_row_filter_walker) */
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +946,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ /* okay (see function publication_row_filter_walker) */
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477154..f1bb01c80d 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -504,6 +504,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1765,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ /* okay (see function publication_row_filter_walker) */
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3088,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 542f9167aa..ffdf86fc7e 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,10 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ /* okay (see function publication_row_filter_walker) */
+ pstate->p_hasTargetSRFs = true;
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a43c..9041847087 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the
+ * relation qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,82 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get relation qual. DISTINCT prevents the same expression, when a
+ * table is part of multiple publications, from being included more
+ * than once in the final expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ /*
+ * Multiple row filter expressions for the same table will be combined
+ * by COPY using OR. If any of the publications has no row filter for
+ * this table, the whole table will be copied, so there is no need to
+ * build a unified row filter expression at all.
+ */
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ /*
+ * An entry without a row filter expression means the whole
+ * table is copied: clean up any previously collected
+ * expressions and return with none.
+ */
+ if (isnull)
+ {
+ if (*qual)
+ {
+ list_free(*qual);
+ *qual = NIL;
+ }
+ ExecClearTuple(slot);
+ break;
+ }
+ else
+ {
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+ ExecClearTuple(slot);
+ }
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
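
As a side note for reviewers, the NULL handling above (a single publication without a row filter discards every collected expression) can be sketched in isolation; the helper name and input shapes below are illustrative, not part of the patch:

```python
def combine_quals(quals):
    """Combine per-publication row filters for initial table sync.

    Each element is a filter expression string, or None when the
    publication has no filter for the table.  A single None wins:
    the whole table must be copied, so no WHERE clause is built.
    """
    collected = []
    for qual in quals:
        if qual is None:
            return None  # no filter in one publication -> copy everything
        collected.append(f"({qual})")
    if not collected:
        return None
    return " OR ".join(collected)
```

For example, `combine_quals(["id > 2000", None])` yields no filter at all, matching the patch's behavior of freeing the qual list as soon as a NULL prqual is seen.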
@@ -809,6 +889,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +898,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +909,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with OR'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +929,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of OR'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6f6a203dea..cfd41526d4 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,18 +13,28 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/makefuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -115,6 +125,17 @@ typedef struct RelationSyncEntry
bool replicate_valid;
PublicationActions pubactions;
+ /*
+ * An ExprState pointer alone cannot distinguish "no cache", "invalid
+ * cache" and "valid cache", so the flag exprstate_valid indicates
+ * whether the current cache is valid. Multiple ExprState entries might
+ * be needed if there are multiple publications for a single table;
+ * since publication actions can differ, the expressions cannot always
+ * be combined into one.
+ */
+ bool exprstate_valid;
+ ExprState *exprstate[3]; /* ExprState array for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+
/*
* OID of the relation to publish changes as. For a partition, this may
* be set to one of its ancestors whose schema will be used when
@@ -137,7 +158,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +167,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry, int action);
+
/*
* Specify output plugin callbacks
*/
@@ -620,6 +648,305 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
OutputPluginWrite(ctx, false);
}
+/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+ MemoryContext oldctx;
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be coerced to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * Cache the ExprState in CacheMemoryContext. This mirrors
+ * ExecPrepareExpr(), which cannot be used here because it requires an
+ * EState. There should probably be another executor function for
+ * preparing expressions that are executed outside a normal Plan tree
+ * context.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+ MemoryContextSwitchTo(oldctx);
+
+ return exprstate;
+}
+
+/*
+ * Evaluate the row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false, i.e. the
+ * change isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change is to be replicated; false otherwise.
+ */
+static bool
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, int action)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = false;
+ Oid relid = RelationGetRelid(relation);
+
+ /*
+ * The flag exprstate_valid indicates the row filter cache state. An
+ * invalid state means there is no cache yet or there is no row filter for
+ * this relation.
+ *
+ * Once the row filter is cached, it is rebuilt only after a relation
+ * entry invalidation, so it seems fine to cache it here.
+ *
+ * This code was not added to get_rel_sync_entry() to avoid updating the
+ * cache even when it has not changed. It also postpones building the
+ * cached expression until close to the moment it is used, so no cycles
+ * are wasted on changes (such as truncate) that don't use it, or on code
+ * paths that bail out before reaching this point.
+ */
+ if (!entry->exprstate_valid)
+ {
+ MemoryContext oldctx;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+ List *rfnodes[3] = {NIL, NIL, NIL}; /* one Node List per
+ * publication operation */
+ bool rf_in_all_pubs[3] = {true, true, true}; /* row filter in all
+ * publications? */
+ Node *rfnode;
+ int i;
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains. Release the tuple table slot if it
+ * already exists.
+ */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+
+ /*
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ *
+ * Multiple publications might have multiple row filters for this
+ * relation. Since row filter usage depends on the DML operation,
+ * there are multiple lists (one per operation) to which the row
+ * filters are appended.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+
+ /*
+ * Append the row filters according to the DML operations each
+ * publication replicates. If an entry does not have a row filter,
+ * remember that (rf_in_all_pubs): it is used to discard all row
+ * filter expressions for that DML operation and, as a result, bail
+ * out through a fast path before initializing the state to process
+ * the row filter expression.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(entry->publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (rfisnull)
+ {
+ if (pub->pubactions.pubinsert)
+ rf_in_all_pubs[0] = false; /* INSERT */
+ if (pub->pubactions.pubupdate)
+ rf_in_all_pubs[1] = false; /* UPDATE */
+ if (pub->pubactions.pubdelete)
+ rf_in_all_pubs[2] = false; /* DELETE */
+ }
+ else
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ if (pub->pubactions.pubinsert)
+ rfnodes[0] = lappend(rfnodes[0], rfnode); /* INSERT */
+ if (pub->pubactions.pubupdate)
+ rfnodes[1] = lappend(rfnodes[1], rfnode); /* UPDATE */
+ if (pub->pubactions.pubdelete)
+ rfnodes[2] = lappend(rfnodes[2], rfnode); /* DELETE */
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+ }
+
+ /*
+ * For each publication action, store a single row filter expression.
+ * Whether this expression is used depends on the rf_in_all_pubs
+ * value.
+ */
+ for (i = 0; i < 3; i++)
+ {
+ int n;
+
+ /*
+ * All row filter expressions are discarded if there is one
+ * publication-relation entry without a row filter. That's because
+ * all expressions are combined with the OR operator: the absence
+ * of a row filter means "replicate all rows", which dominates any
+ * other expression.
+ */
+ if (!rf_in_all_pubs[i])
+ {
+ list_free(rfnodes[i]);
+ entry->exprstate[i] = NULL;
+ continue;
+ }
+
+ n = list_length(rfnodes[i]);
+ if (n == 1)
+ rfnode = linitial(rfnodes[i]);
+ else if (n > 1)
+ rfnode = (Node *) makeBoolExpr(OR_EXPR, rfnodes[i], -1);
+ else
+ rfnode = NULL;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ if (n == 0)
+ entry->exprstate[i] = NULL;
+ else
+ entry->exprstate[i] = pgoutput_row_filter_init_expr(rfnode);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ entry->exprstate_valid = true;
+ }
+
+ /*
+ * Fast path: bail out if there is no row filter to process for this
+ * operation. See the explanation of rf_in_all_pubs above.
+ */
+ if (action == REORDER_BUFFER_CHANGE_INSERT && entry->exprstate[0] == NULL)
+ return true;
+ if (action == REORDER_BUFFER_CHANGE_UPDATE && entry->exprstate[1] == NULL)
+ return true;
+ if (action == REORDER_BUFFER_CHANGE_DELETE && entry->exprstate[2] == NULL)
+ return true;
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ /*
+ * The default behavior for UPDATEs is to use the old tuple for row
+ * filtering:
+ * 1) The columns used in the expression are restricted to the REPLICA
+ * IDENTITY, so all of their values are available to evaluate the
+ * expression.
+ * 2) If the old tuple satisfies the row filter but the new tuple doesn't,
+ * there is a data consistency issue. Using the new tuple is worse (the
+ * old row is kept and could eventually conflict with a row inserted in
+ * the future) than using the old tuple (the row is modified on the
+ * subscriber even though replication could not change it again due to
+ * the row filter expression).
+ */
+ ExecStoreHeapTuple(oldtuple ? oldtuple : newtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * Process the row filter. Multiple row filters were already combined
+ * above.
+ */
+ if (action == REORDER_BUFFER_CHANGE_INSERT)
+ result = pgoutput_row_filter_exec_expr(entry->exprstate[0], ecxt);
+ else if (action == REORDER_BUFFER_CHANGE_UPDATE)
+ result = pgoutput_row_filter_exec_expr(entry->exprstate[1], ecxt);
+ else if (action == REORDER_BUFFER_CHANGE_DELETE)
+ result = pgoutput_row_filter_exec_expr(entry->exprstate[2], ecxt);
+ else
+ Assert(false);
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
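
The decision logic of pgoutput_row_filter() can be summarized with a small sketch (names are illustrative; per-action predicates stand in for the cached ExprState entries):

```python
def row_filter(exprstate, action, tup):
    """Decide whether a change is published, mirroring pgoutput_row_filter.

    exprstate maps each action ("insert", "update", "delete") to a
    predicate taking the tuple and returning True, False or None
    (standing in for SQL NULL).  A missing predicate means "no row
    filter": publish everything.  A NULL result is treated as false,
    i.e. the change is not replicated.
    """
    predicate = exprstate.get(action)
    if predicate is None:
        return True  # fast path: no row filter for this action
    result = predicate(tup)
    return bool(result) if result is not None else False
```

Note how the NULL-evaluates-to-false rule and the per-action fast path interact: a filter on INSERT has no effect on DELETEs replicated through a publication without a delete filter.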
/*
* Sends the decoded DML over wire.
*
@@ -647,7 +974,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +998,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +1005,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry, change->action))
+ break;
+
+ /*
+ * The schema must be sent before the relation is possibly
+ * switched below, because maybe_send_schema() also sends
+ * the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +1038,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry, change->action))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +1072,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry, change->action))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +1141,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1116,9 +1463,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1139,8 +1487,11 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->exprstate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate[0] = entry->exprstate[1] = entry->exprstate[2] = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1245,9 +1596,6 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
}
list_free(pubids);
@@ -1340,6 +1688,8 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
*/
if (entry != NULL)
{
+ int i;
+
entry->schema_sent = false;
list_free(entry->streamed_txns);
entry->streamed_txns = NIL;
@@ -1354,6 +1704,24 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanups
+ */
+ entry->exprstate_valid = false;
+ for (i = 0; i < 3; i++)
+ {
+ if (entry->exprstate[i] != NULL)
+ {
+ pfree(entry->exprstate[i]);
+ entry->exprstate[i] = NULL;
+ }
+ }
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
}
}
@@ -1365,6 +1733,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1743,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1392,6 +1763,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index c590003f18..6948ee0603 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4265,6 +4265,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4275,9 +4276,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4286,6 +4294,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4326,6 +4335,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationObject(&(pubrinfo[j].dobj), fout);
@@ -4396,8 +4409,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE %s", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index d1d8608e9c..0842a3c936 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -630,6 +630,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index ea721d963a..fb5cfc510f 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3150,17 +3150,21 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
" JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
"WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
"UNION\n"
"SELECT pubname\n"
+ " , pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+ " JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
"WHERE pr.prrelid = '%s'\n"
"UNION\n"
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
"WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
"ORDER BY 1;",
@@ -3196,6 +3200,13 @@ describeOneTableDetails(const char *schemaname,
printfPQExpBuffer(&buf, " \"%s\"",
PQgetvalue(result, i, 0));
+ /* row filter (if any) */
+ if (pset.sversion >= 150000)
+ {
+ if (!PQgetisnull(result, i, 1))
+ appendPQExpBuffer(&buf, " WHERE %s", PQgetvalue(result, i, 1));
+ }
+
printTableAddFooter(&cont, buf.data);
}
PQclear(result);
@@ -6320,8 +6331,12 @@ addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
for (i = 0; i < count; i++)
{
if (!singlecol)
+ {
printfPQExpBuffer(buf, " \"%s.%s\"", PQgetvalue(res, i, 0),
PQgetvalue(res, i, 1));
+ if (!PQgetisnull(res, i, 2))
+ appendPQExpBuffer(buf, " WHERE %s", PQgetvalue(res, i, 2));
+ }
else
printfPQExpBuffer(buf, " \"%s\"", PQgetvalue(res, i, 0));
@@ -6450,8 +6465,15 @@ describePublications(const char *pattern)
{
/* Get the tables for the specified publication */
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 2f412ca3db..e1be254b9a 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,22 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (");
+
+ /*
+ * "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with
+ * table attributes
+ */
+
+ /*
+ * "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with
+ * table attributes
+ */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2777,12 +2793,19 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("TABLES", "TABLES IN SCHEMA");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
- else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
+ /*
+ * "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with
+ * table attributes
+ */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
+
/*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
* ..."
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 1ae439e6f3..fa23f09d69 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -86,6 +86,7 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -122,7 +123,7 @@ extern List *GetPubPartitionOptionRelations(List *result,
Oid relid);
extern bool is_publishable_relation(Relation rel);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504cbb..154bb61777 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 067138e6b5..b38e6633fb 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3641,6 +3641,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee179082ce..d58ae6a63f 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 1feb558968..3bf795756b 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -239,6 +239,173 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE TABLE testpub_rf_tbl5 (a xml);
+CREATE SCHEMA testpub_rf_schema1;
+CREATE TABLE testpub_rf_schema1.testpub_rf_tbl5 (h integer);
+CREATE SCHEMA testpub_rf_schema2;
+CREATE TABLE testpub_rf_schema2.testpub_rf_tb16 (i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = 'insert');
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | f | f | f | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | f | f | f | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
+ "public.testpub_rf_tbl3" WHERE ((e > 1000) AND (e < 2000))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | f | f | f | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE ((e > 1000) AND (e < 2000))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | f | f | f | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
+
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish = 'insert');
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish = 'insert');
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+ Table "public.testpub_rf_tbl1"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+----------+--------------+-------------
+ a | integer | | | | plain | |
+ b | text | | | | extended | |
+Publications:
+ "testpub5a" WHERE (a > 1)
+ "testpub5b"
+ "testpub5c" WHERE (a > 3)
+
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = 'insert');
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+ Publication testpub_syntax1
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | f | f | f | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (e < 999)
+
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_schema1.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = 'insert');
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+ Publication testpub_syntax2
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | f | f | f | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "testpub_rf_schema1.testpub_rf_tbl5" WHERE (h < 999)
+
+DROP PUBLICATION testpub_syntax2;
+-- fail - schemas don't allow WHERE clause
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1 WHERE (a = 123);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...ntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1 WHERE (a =...
+ ^
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1, testpub_rf_schema1 WHERE (a = 123);
+ERROR: WHERE clause for schema not allowed
+LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1, testpub_rf...
+ ^
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE clause
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
+ERROR: conflicting or redundant WHERE clauses for table "testpub_rf_tbl1"
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
+ERROR: conflicting or redundant WHERE clauses for table "testpub_rf_tbl1"
+RESET client_min_messages;
+-- fail - user-defined operators are not allowed
+CREATE FUNCTION testpub_rf_func1(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func1, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: User-defined operators are not allowed.
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func2() RETURNS integer AS $$ BEGIN RETURN 123; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a >= testpub_rf_func2());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: User-defined functions are not allowed (testpub_rf_func2).
+-- fail - non-immutable functions are not allowed. random() is volatile.
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a < random());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: Non-immutable functions are not allowed (random).
+-- ok - builtin operators are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
+-- ok - immutable builtin functions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- ok - conditional expressions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (a IS DOCUMENT);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (xmlexists('//foo[text() = ''bar'']' PASSING BY VALUE a));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1, 2) = a);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (CASE a WHEN 5 THEN true ELSE false END);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (COALESCE(b, 'foo') = 'foo');
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (GREATEST(a, 10) > 10);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IN (2, 4, 6));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (ARRAY[a] <@ ARRAY[2, 4, 6]);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl1 WHERE (e < 27);
+ERROR: cannot use a WHERE clause when removing a table from publication
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_schema2;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_schema2, TABLE testpub_rf_schema2.testpub_rf_tb16;
+ERROR: cannot add relation "testpub_rf_schema2.testpub_rf_tb16" to publication
+DETAIL: Table's schema "testpub_rf_schema2" is already part of the publication or part of the specified schema list.
+RESET client_min_messages;
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema1.testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema2.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_schema1;
+DROP SCHEMA testpub_rf_schema2;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func1(integer, integer);
+DROP FUNCTION testpub_rf_func2();
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 8fa0435c32..7ab0ef7e63 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -134,6 +134,100 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE TABLE testpub_rf_tbl5 (a xml);
+CREATE SCHEMA testpub_rf_schema1;
+CREATE TABLE testpub_rf_schema1.testpub_rf_tbl5 (h integer);
+CREATE SCHEMA testpub_rf_schema2;
+CREATE TABLE testpub_rf_schema2.testpub_rf_tb16 (i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = 'insert');
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish = 'insert');
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish = 'insert');
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = 'insert');
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_schema1.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = 'insert');
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;
+-- fail - schemas don't allow WHERE clause
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1 WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1, testpub_rf_schema1 WHERE (a = 123);
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE clause
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
+RESET client_min_messages;
+-- fail - user-defined operators are not allowed
+CREATE FUNCTION testpub_rf_func1(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func1, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func2() RETURNS integer AS $$ BEGIN RETURN 123; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a >= testpub_rf_func2());
+-- fail - non-immutable functions are not allowed. random() is volatile.
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a < random());
+-- ok - builtin operators are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
+-- ok - immutable builtin functions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- ok - conditional expressions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (a IS DOCUMENT);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (xmlexists('//foo[text() = ''bar'']' PASSING BY VALUE a));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1, 2) = a);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (CASE a WHEN 5 THEN true ELSE false END);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (COALESCE(b, 'foo') = 'foo');
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (GREATEST(a, 10) > 10);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IN (2, 4, 6));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (ARRAY[a] <@ ARRAY[2, 4, 6]);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl1 WHERE (e < 27);
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_schema2;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_schema2, TABLE testpub_rf_schema2.testpub_rf_tb16;
+RESET client_min_messages;
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema1.testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema2.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_schema1;
+DROP SCHEMA testpub_rf_schema2;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func1(integer, integer);
+DROP FUNCTION testpub_rf_func2();
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
new file mode 100644
index 0000000000..5867cfd756
--- /dev/null
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -0,0 +1,355 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 10;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4a FOR TABLE tab_rowfilter_4 WHERE (c % 2 = 0)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4b FOR TABLE tab_rowfilter_4");
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) SELECT generate_series(1, 10)");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4a, tap_pub_4b"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if the
+# c value is divisible by 2 OR 3 (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(13|2|20),
+ 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but when one publication has no filter it
+# means OR everything, i.e. the same as having no filter at all.
+# Expect all rows: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(10|1|10),
+ 'check initial data copy from table tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false, so the row filter
+# of each partition is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is( $result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k'
+);
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) VALUES (21), (22), (23), (24), (25)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) VALUES (0), (11), (12)");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if the
+# c value is divisible by 2 OR by 3.
+#
+# Expect original rows (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+# Plus (21, 22, 24)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(16|2|24), 'check replicated rows to tab_rowfilter_2');
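The OR'ed-filter arithmetic behind the qq(16|2|24) expectation can be modeled in a few lines of Python. This is a sketch of the set logic only, not of the server code; it assumes (consistent with the expected-rows comment above) that rows 1..20 were inserted into tab_rowfilter_2 before the subscription existed.

```python
def replicated(rows, filters):
    # A row is replicated if ANY publication's row filter accepts it.
    return [c for c in rows if any(f(c) for f in filters)]

initial = list(range(1, 21))        # rows assumed present before CREATE SUBSCRIPTION
later = [21, 22, 23, 24, 25]        # rows inserted afterwards
filters = [lambda c: c % 2 == 0,    # tap_pub_1: WHERE (c % 2 = 0)
           lambda c: c % 3 == 0]    # tap_pub_2: WHERE (c % 3 = 0)

rows = replicated(initial + later, filters)
print(len(rows), min(rows), max(rows))  # 16 2 24
```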
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but when one publication has no filter it
+# means OR everything - i.e. the same as no filter at all.
+# Expect all rows from initial copy: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+# And also (0, 11, 12)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - UPDATE (1600, NULL) YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1600|
+1601|test 1601 updated
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true, so the row filter
+# of the root partitioned table is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
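The partition-versus-root filter selection that these tests exercise can be sketched in Python. The helper names are hypothetical, and the partition boundary at a = 10000 is an assumption inferred from the table names less_10k/greater_10k; the lambdas mirror the WHERE clauses quoted in the comments above.

```python
ROOT_FILTER = lambda a: a < 5000       # tab_rowfilter_partitioned: WHERE (a < 5000)
LESS_10K_FILTER = lambda a: a < 6000   # tab_rowfilter_less_10k: WHERE (a < 6000)
NO_FILTER = lambda a: True             # tab_rowfilter_greater_10k: no filter

def partition_filter(a):
    # assumed partition boundary at a = 10000
    return LESS_10K_FILTER if a < 10000 else NO_FILTER

def replicates(a, via_root):
    # publish_via_partition_root=true uses the root table's filter;
    # false uses the filter of the partition the row lands in.
    chosen = ROOT_FILTER if via_root else partition_filter(a)
    return chosen(a)

inserts = [1, 7000, 15000, 5500, 2, 6005, 16000]
print([a for a in inserts if replicates(a, via_root=False)])  # [1, 15000, 5500, 2, 16000]
print([a for a in inserts if replicates(a, via_root=True)])   # [1, 2]
```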
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
2.20.1
On Mon, Dec 6, 2021 at 6:49 AM Euler Taveira <euler@eulerto.com> wrote:
On Fri, Dec 3, 2021, at 8:12 PM, Euler Taveira wrote:
PS> I will update the commit message in the next version. I barely changed the
documentation to reflect the current behavior. I probably missed some changes
but I will fix in the next version.

I realized that I forgot to mention a few things about the UPDATE behavior.
Regardless of 0003, we need to define which tuple will be used to evaluate the
row filter for UPDATEs. We already discussed it circa [1]. This current version
chooses *new* tuple. Is it the best choice?
But with 0003, we are using both tuples for evaluating the row
filter, so instead of fixing 0001, why don't we just merge 0003 with
0001? I mean, eventually 0003 is doing the agreed behavior,
i.e. if just OLD matches the filter then convert the UPDATE to
DELETE; OTOH, if only NEW matches the filter then convert the UPDATE
to INSERT. Do you think that even if we merge 0001 and 0003 there is
still an open issue regarding which row to select for the filter?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Sat, Dec 4, 2021 at 4:43 AM Euler Taveira <euler@eulerto.com> wrote:
On Thu, Dec 2, 2021, at 4:18 AM, Peter Smith wrote:
PSA a new v44* patch set.
We have been actively developing this feature for some months and we have
improved it a lot. This has been good teamwork. It seems a good time to provide
a retrospective for this feature based on the consensus we reached until now.

The current design has one row filter per publication-table mapping. It allows
flexible choices while using the same table for multiple replication purposes.
The WHERE clause was chosen as the syntax to declare the row filter expression
(enclosed by parentheses).

There was a lot of discussion about which columns are allowed in the row
filter expression. The consensus was that publications that publish UPDATE
and/or DELETE operations should check that the columns in the row filter
expression are part of the replica identity. Otherwise, these DML operations
couldn't be replicated.

We also discussed which expressions would be allowed. We couldn't allow
all kinds of expressions because of the way the logical decoding infrastructure was
designed, some expressions could break the replication. Hence, we decided to
allow only "simple expressions". By "simple expression", we mean to restrict
(a) user-defined objects (functions, operators, types) and (b) immutable
builtin functions.
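The replica-identity restriction described above can be sketched as a small validation predicate. The names are hypothetical (the actual check lives in the publisher's validation code); this only models the column-set rule.

```python
def filter_columns_ok(filter_cols, replica_identity_cols, publish_actions):
    # For publications that publish UPDATE and/or DELETE, every column
    # referenced by the row filter must be part of the replica identity,
    # because old tuples carry only replica identity columns.
    if publish_actions & {"update", "delete"}:
        return set(filter_cols) <= set(replica_identity_cols)
    return True  # INSERT-only publications may filter on any column

print(filter_columns_ok({"a"}, {"a"}, {"insert", "update"}))   # True
print(filter_columns_ok({"a", "b"}, {"a"}, {"update"}))        # False
print(filter_columns_ok({"a", "b"}, {"a"}, {"insert"}))        # True
```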
I think what you said as (b) is wrong because we want to allow builtin
immutable functions. See discussion [1].
A subscription can subscribe to multiple publications. These publications can
publish the same table. In this case, we have to combine the row filter
expressions to decide if the row will be replicated or not. The consensus was to
replicate a row if any of the row filters returns true. It means that if one
publication-table mapping does not have a row filter, the row will be
replicated. There is an optimization for this case that provides an empty
expression for this table. Hence, it bails out and replicates the row without
running the row filter code.
In addition to this, we have decided to have an exception/optimization
where we need to consider publish actions while combining multiple
filters as we can't combine insert/update filters.
The same logic applies to the initial table synchronization if there are
multiple row filters: copy all rows that satisfy at least one row filter
expression. If the subscriber is a pre-15 version, data synchronization won't
use row filters even if they are defined in the publisher.

If we are dealing with partitioned tables, the publication parameter
publish_via_partition_root determines whether it uses the partition row filter
(false) or the root partitioned table row filter (true).

I used the last patch series (v44) posted by Peter Smith [1]. I did a lot of
improvements in this new version (v45). I merged 0001 (it is basically the main
patch I wrote) and 0004 (autocomplete). As I explained in [2], I implemented a
patch (that is incorporated in the v45-0001) to fix this issue. I saw that
Peter already proposed a slightly different patch (0006). I read this patch and
concluded that it would be better to keep the version I have. It fixes a few
things and also includes more comments. I attached another patch (v45-0002)
that includes the expression validation. It is based on 0002. I completely
overhauled it. There are additional expressions that were not supported by the
previous version (such as conditional expressions [CASE, COALESCE, NULLIF,
GREATEST, LEAST], array operators, XML operators). I probably didn't finish the
supported node list (there are a few primitive nodes that need to be checked).
However, the current "simple expression" routine seems promising. I plan to
integrate v45-0002 in the next patch version. I attached it here for comparison
purposes only.

My next step is to review 0003. As I said before, I would like to treat it as a
separate feature.
I don't think that would be the right decision, as we already discussed
that in detail and reached the current conclusion, on which
Ajin's 0003 patch is based.
I know that it is useful for data consistency but this patch
is already too complex.
True, but that is the main reason the review and development are being
done as separate sub-features. I suggest still keeping the similar
separation till some of the reviews of each of the patches are done,
otherwise, we need to rethink how to divide for easier review. We need
to retain the 0005 patch because that handles many problems without
which the main patch is incomplete and buggy w.r.t replica identity.
[1]: /messages/by-id/CAA4eK1+XoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ@mail.gmail.com
--
With Regards,
Amit Kapila.
On Mon, Dec 6, 2021 at 12:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Mon, Dec 6, 2021 at 6:49 AM Euler Taveira <euler@eulerto.com> wrote:
On Fri, Dec 3, 2021, at 8:12 PM, Euler Taveira wrote:
PS> I will update the commit message in the next version. I barely changed the
documentation to reflect the current behavior. I probably missed some changes
but I will fix in the next version.

I realized that I forgot to mention a few things about the UPDATE behavior.
Regardless of 0003, we need to define which tuple will be used to evaluate the
row filter for UPDATEs. We already discussed it circa [1]. This current version
chooses *new* tuple. Is it the best choice?

But with 0003, we are using both the tuple for evaluating the row
filter, so instead of fixing 0001, why we don't just merge 0003 with
0001?
I agree that would be better than coming up with an entirely new
approach especially when the current approach is discussed and agreed
upon.
I mean eventually, 0003 is doing what is the agreed behavior,
i.e. if just OLD is matching the filter then convert the UPDATE to
DELETE OTOH if only new is matching the filter then convert the UPDATE
to INSERT.
+1.
Do you think that even we merge 0001 and 0003 then also
there is an open issue regarding which row to select for the filter?
I think eventually we should merge 0001 and 0003 to avoid any sort of
data consistency issues, but it is better to keep them separate for the
purpose of a review at this stage. If I am not wrong that still needs
the bug-fix we are discussing as part of CF entry [1], right? If so,
isn't it better to review that bug-fix patch and the 0003 patch being
discussed here [2] to avoid missing any already reported issues in
this thread?
[1]: https://commitfest.postgresql.org/36/3162/
[2]: /messages/by-id/CAHut+PtJnnM8MYQDf7xCyFAp13U_0Ya2dv-UQeFD=ghixFLZiw@mail.gmail.com
--
With Regards,
Amit Kapila.
On Mon, Dec 6, 2021 at 6:49 AM Euler Taveira <euler@eulerto.com> wrote:
On Fri, Dec 3, 2021, at 8:12 PM, Euler Taveira wrote:
PS> I will update the commit message in the next version. I barely changed the
documentation to reflect the current behavior. I probably missed some changes
but I will fix in the next version.

I realized that I forgot to mention a few things about the UPDATE behavior.
Regardless of 0003, we need to define which tuple will be used to evaluate the
row filter for UPDATEs. We already discussed it circa [1]. This current version
chooses *new* tuple. Is it the best choice?
Apart from the data inconsistency problems you outlined below, I think
there is a major design problem with that w.r.t toast tuples as
unchanged key values won't be part of *new* tuple.
Let's check all cases. There are 2 rows on the provider. One row satisfies the
row filter and the other one doesn't. For each case, I expect the initial rows
to be there (no modifications). The DDLs are:

CREATE TABLE foo (a integer, b text, PRIMARY KEY(a));
INSERT INTO foo (a, b) VALUES(10, 'abc'),(30, 'abc');
CREATE PUBLICATION bar FOR TABLE foo WHERE (a > 20);

The table describes what happens on the subscriber. BEFORE is the current row on
the subscriber. OLD, NEW and OLD & NEW are the action/row if we consider
different ways to evaluate the row filter.

-- case 1: old tuple (10, abc) ; new tuple (10, def)
UPDATE foo SET b = 'def' WHERE a = 10;

+-----------+--------------------+------------------+------------------+
| BEFORE    | OLD                | NEW              | OLD & NEW        |
+-----------+--------------------+------------------+------------------+
| NA        | NA                 | NA               | NA               |
+-----------+--------------------+------------------+------------------+

If the old and new tuple don't satisfy the row filter, there is no issue.
-- case 2: old tuple (30, abc) ; new tuple (30, def)
UPDATE foo SET b = 'def' WHERE a = 30;

+-----------+--------------------+------------------+------------------+
| BEFORE    | OLD                | NEW              | OLD & NEW        |
+-----------+--------------------+------------------+------------------+
| (30, abc) | UPDATE (30, def)   | UPDATE (30, def) | UPDATE (30, def) |
+-----------+--------------------+------------------+------------------+

If the old and new tuple satisfy the row filter, there is no issue.
-- case 3: old tuple (30, abc) ; new tuple (10, def)
UPDATE foo SET a = 10, b = 'def' WHERE a = 30;

+-----------+--------------------+------------------+------------------+
| BEFORE    | OLD                | NEW              | OLD & NEW        |
+-----------+--------------------+------------------+------------------+
| (30, abc) | UPDATE (10, def) * | KEEP (30, abc) * | KEEP (30, abc) * |
+-----------+--------------------+------------------+------------------+

If the old tuple satisfies the row filter but the new tuple doesn't, we have a
data consistency issue. Since the old tuple satisfies the row filter, the
initial table synchronization copies this row. However, after the UPDATE the
new tuple doesn't satisfy the row filter then, from the data consistency
perspective, that row should be removed on the subscriber.
This is the reason we decide to make such cases to transform UPDATE to DELETE.
The OLD sends the UPDATE because it satisfies the row filter (if it is a
sharding solution this new row should be moved to another node). The new row
would likely not be modified by replication again. That's a data inconsistency
according to the row filter.

The NEW and OLD & NEW don't send the UPDATE because it doesn't satisfy the row
filter. Keeping the old row is undesirable because it doesn't reflect what we have
on the source. This row on the subscriber would likely not be modified by
replication again. If someone inserted a new row with a = 30, replication will
stop because there is already a row with that value.
This shouldn't be a problem with the v44 patch version (0003 handles it).
-- case 4: old tuple (10, abc) ; new tuple (30, def)
UPDATE foo SET a = 30, b = 'def' WHERE a = 10;

+-----------+--------------------+------------------+------------------+
| BEFORE    | OLD                | NEW              | OLD & NEW        |
+-----------+--------------------+------------------+------------------+
| NA        | NA !               | NA !             | NA               |
+-----------+--------------------+------------------+------------------+

The OLD and OLD & NEW don't send the UPDATE because it doesn't satisfy the row
filter. The NEW sends the UPDATE because it satisfies the row filter but there
is no row to modify. The current behavior does nothing. However, it should
INSERT the new tuple. Subsequent UPDATE or DELETE have no effect. It could be a
surprise for an application that expects the same data set from the provider.
Again this is addressed by V44 as an Insert would be performed in this case.
If we have to choose the default behavior, I would say use the old tuple to
evaluate the row filter. Why? The validation already restricts the columns to the
replica identity, so there isn't an issue with missing (NULL) columns. Case
3 updates the row with a value that is not consistent, but keeping the old row
is worse because it could stop the replication if someone inserted the old key
in a new row on the provider. Case 4 ignores the UPDATE if it cannot find
the tuple, but it could raise an error if there were a strict mode.
Hmm, I think it is much better to translate Update to Delete in case-3
and Update to Insert in case-4 as there shouldn't be any data
consistency issues after that. All these issues have been discussed in
detail in this thread and based on that we decided to follow the v44
(0003) patch version approach. We have also investigated some other
replication solutions and they were also doing the similar
translations to avoid such issues.
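The agreed translation scheme (the v44 0003 behavior) amounts to a small decision table over whether the OLD and NEW tuples match the row filter. A Python sketch with hypothetical names, not the pgoutput code:

```python
def transform_update(old_matches, new_matches):
    if old_matches and new_matches:
        return "UPDATE"            # row stays inside the published set (case 2)
    if old_matches:
        return "DELETE"            # row leaves the published set (case 3)
    if new_matches:
        return "INSERT"            # row enters the published set (case 4)
    return None                    # row was never visible downstream (case 1)

match = lambda a: a > 20           # WHERE (a > 20), as in Euler's example
print(transform_update(match(30), match(30)))  # UPDATE
print(transform_update(match(30), match(10)))  # DELETE
print(transform_update(match(10), match(30)))  # INSERT
print(transform_update(match(10), match(10)))  # None
```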
Since this change is very simple to revert, this new version contains this
modification. I also improved the documentation and removed extra parentheses from
psql/pg_dump. As I said in the previous email, I merged the validation patch too.
As said previously it might be better to keep those separate for
easier review. It is anyway better to split such a big patch for ease
of review even if in the end we combine all the work.
FWIW, in the previous version I removed code that compares nodes to decide if
it is necessary to remove the publication-relation entry. I had similar code
in an ancient version of this patch but decided that the additional code is not
worth it.

There is at least one issue in the current code that should be addressed: PK or
REPLICA IDENTITY modification could break the publication check for UPDATEs and
DELETEs.
Please see patch 0005 [1]. I think it tries to address the issues
w.r.t Replica Identity interaction with this feature. Feel free to
test/review and let us know if you see any issues.
[1]: /messages/by-id/CAHut+PtJnnM8MYQDf7xCyFAp13U_0Ya2dv-UQeFD=ghixFLZiw@mail.gmail.com
--
With Regards,
Amit Kapila.
On Sat, Dec 4, 2021 at 10:13 AM Euler Taveira <euler@eulerto.com> wrote:
On Thu, Dec 2, 2021, at 4:18 AM, Peter Smith wrote:
PSA a new v44* patch set.
...
I used the last patch series (v44) posted by Peter Smith [1]. I did a lot of
improvements in this new version (v45). I merged 0001 (it is basically the main
patch I wrote) and 0004 (autocomplete). As I explained in [2], I implemented a
patch (that is incorporated in the v45-0001) to fix this issue. I saw that
Peter already proposed a slightly different patch (0006). I read this patch and
concludes that it would be better to keep the version I have. It fixes a few
things and also includes more comments.
[1] /messages/by-id/CAHut+PtJnnM8MYQDf7xCyFAp13U_0Ya2dv-UQeFD=ghixFLZiw@mail.gmail.com
[2] /messages/by-id/ca8d270d-f930-4d15-9f24-60f95b364173@www.fastmail.com
As I explained in [2], I implemented a
patch (that is incorporated in the v45-0001) to fix this issue. I saw that
Peter already proposed a slightly different patch (0006). I read this patch and
concludes that it would be better to keep the version I have. It fixes a few
things and also includes more comments.
Your ExprState exprstate array code is essentially the same
logic that was in patch v44-0006, isn't it?
The main difference I saw was
1. I pass the cache index (e.g. IDX_PUBACTION_DELETE etc) to the
pgoutput_filter, but
2. You are passing in the ReorderBufferChangeType value.
IMO the ability to directly access the cache array is more efficient.
The function is called for every row operation (e.g. consider 1
million rows), so I felt the overhead of unnecessary if/else
should be avoided.
e.g.
------
if (action == REORDER_BUFFER_CHANGE_INSERT)
result = pgoutput_row_filter_exec_expr(entry->exprstate[0], ecxt);
else if (action == REORDER_BUFFER_CHANGE_UPDATE)
result = pgoutput_row_filter_exec_expr(entry->exprstate[1], ecxt);
else if (action == REORDER_BUFFER_CHANGE_DELETE)
result = pgoutput_row_filter_exec_expr(entry->exprstate[2], ecxt);
else
Assert(false);
------
Why not just use a direct index like was in patch v44-0006 in the first place?
e.g.
------
result = pgoutput_row_filter_exec_expr(entry->exprstate[idx_pubaction], ecxt);
------
Conveniently, those ReorderBufferChangeType first 3 enums are the ones
you want so you can still pass them if you want.
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
REORDER_BUFFER_CHANGE_DELETE,
Just use them to directly index into entry->exprstate[action] and so
remove the excessive if/else.
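The difference is easy to see in miniature. This is a Python analogue of the C array lookup; the assumption is that the first three ReorderBufferChangeType values are 0, 1 and 2, as listed above.

```python
CHANGE_INSERT, CHANGE_UPDATE, CHANGE_DELETE = range(3)

# One filter state per publish action, indexed directly by the change type.
exprstate = ["insert filter", "update filter", "delete filter"]

def filter_for(action):
    return exprstate[action]   # single array lookup, no if/else chain

print(filter_for(CHANGE_UPDATE))  # update filter
```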
What do you think?
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Mon, Dec 6, 2021, at 3:35 AM, Dilip Kumar wrote:
On Mon, Dec 6, 2021 at 6:49 AM Euler Taveira <euler@eulerto.com> wrote:
On Fri, Dec 3, 2021, at 8:12 PM, Euler Taveira wrote:
PS> I will update the commit message in the next version. I barely changed the
documentation to reflect the current behavior. I probably missed some changes
but I will fix in the next version.

I realized that I forgot to mention a few things about the UPDATE behavior.
Regardless of 0003, we need to define which tuple will be used to evaluate the
row filter for UPDATEs. We already discussed it circa [1]. This current version
chooses *new* tuple. Is it the best choice?

But with 0003, we are using both the tuple for evaluating the row
filter, so instead of fixing 0001, why we don't just merge 0003 with
0001? I mean eventually, 0003 is doing what is the agreed behavior,
i.e. if just OLD is matching the filter then convert the UPDATE to
DELETE OTOH if only new is matching the filter then convert the UPDATE
to INSERT. Do you think that even we merge 0001 and 0003 then also
there is an open issue regarding which row to select for the filter?
Maybe I was not clear. IIUC we are still discussing 0003 and I would like to
propose a different default based on the conclusion I came up with. If we merge
0003, that's fine; this change will be useless. If we don't, or it is optional,
it still has its merit.
Do we want to pay the overhead of evaluating both tuples for UPDATEs? I'm still
assessing whether it is worth it. If you think that in general the row filter
contains the primary key and it is rare to change it, it will waste cycles
evaluating the same expression twice. It seems this behavior could be
controlled by a parameter.
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Mon, Dec 6, 2021, at 3:44 AM, Amit Kapila wrote:
I think what you said as (b) is wrong because we want to allow builtin
immutable functions. See discussion [1].
It was a typo. I mean "non-immutable" function.
True, but that is the main reason the review and development are being
done as separate sub-features. I suggest still keeping the similar
separation till some of the reviews of each of the patches are done,
otherwise, we need to rethink how to divide for easier review. We need
to retain the 0005 patch because that handles many problems without
which the main patch is incomplete and buggy w.r.t replica identity.
IMO we should merge sub-features as soon as we reach consensus. Every new
sub-feature breaks comments, tests and documentation if you want to remove or
rearrange patches. It seems I misread 0005. I agree that it is important. I'll
check it.
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Mon, Dec 6, 2021 at 6:18 PM Euler Taveira <euler@eulerto.com> wrote:
On Mon, Dec 6, 2021, at 3:44 AM, Amit Kapila wrote:
True, but that is the main reason the review and development are being
done as separate sub-features. I suggest still keeping the similar
separation till some of the reviews of each of the patches are done,
otherwise, we need to rethink how to divide for easier review. We need
to retain the 0005 patch because that handles many problems without
which the main patch is incomplete and buggy w.r.t replica identity.

IMO we should merge sub-features as soon as we reach consensus. Every new
sub-feature breaks comments, tests and documentation if you want to remove or
rearrange patches.
I agree that there is some effort but OTOH, it gives the flexibility
to do a focussed review and as soon as some patch is ready or close to
ready we can merge in the main patch. This was just a humble
suggestion based on how this patch was making progress and how it has
helped to keep some parts separate by allowing different people to
work on different parts of the problem.
It seems I misread 0005. I agree that it is important. I'll
check it.
Okay, thanks!
--
With Regards,
Amit Kapila.
Hi Euler –
As you know we have been posting patch update versions to the
Row-Filter thread several times a week now for a few months. We are
carefully tracking all open review comments of the thread and fixing
as many as possible with each version posted.
~~
It is true that the multiple patches are difficult to maintain
(particular for test cases impacting other patches), but
- this is the arrangement that Amit preferred (without whose support
as a committer this patch would likely be stalled).
- separate patches have allowed us to spread the work across multiple
people to improve the velocity (e.g. the Hou-san top-up patch 0005).
- having multiple patches also allows the review comments to be more focused.
~~
We were mid-way putting together the next v45* when your latest
attachment was posted over the weekend. So we will proceed with our
original plan to post our v45* (tomorrow).
After v45* is posted we will pause to find what are all the
differences between your unified patch and our v45* patch set. Our
intention is to integrate as many improvements as possible from your
changes into the v46* etc that will follow tomorrow’s v45*. On some
points, we will most likely need further discussion.
With luck, soon everything can be more in sync again.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Friday, December 3, 2021 10:09 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Thu, Dec 2, 2021 at 2:32 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:

On Thursday, December 2, 2021 5:21 AM Peter Smith
<smithpb2250@gmail.com> wrote:
PSA the v44* set of patches.
Thanks for the new patch. Few comments:
1. This is an example in publication doc, but in fact it's not allowed. Should we
change this example?

+CREATE PUBLICATION active_departments FOR TABLE departments WHERE
(active IS TRUE);
postgres=# CREATE PUBLICATION active_departments FOR TABLE departments
WHERE (active IS TRUE);
ERROR: invalid publication WHERE expression for relation "departments"
HINT: only simple expressions using columns, constants and immutable system functions are allowed
Thanks for finding this. Actually, the documentation looks correct to
me. The problem was the validation walker of patch 0002 was being
overly restrictive. It needed to also allow a BooleanTest node.

Now it works (locally) for me. For example.
test_pub=# create table departments(depno int primary key, active boolean);
CREATE TABLE
test_pub=# create publication pdept for table departments where
(active is true) with (publish="insert");
CREATE PUBLICATION
test_pub=# create publication pdept2 for table departments where
(active is false) with (publish="insert");
CREATE PUBLICATION

This fix will be available in v45*.
Thanks for looking into it.
I have another problem with your patch. The document says:
... If the subscription has several publications in
+ which the same table has been published with different filters, those
+ expressions get OR'ed together so that rows satisfying any of the expressions
+ will be replicated. Notice this means if one of the publications has no filter
+ at all then all other filters become redundant.
Then, what if one of the publications is specified as 'FOR ALL TABLES' or 'FOR
ALL TABLES IN SCHEMA'?
For example:
create table tbl (a int primary key);
create publication p1 for table tbl where (a > 10);
create publication p2 for all tables;
create subscription sub connection 'dbname=postgres port=5432' publication p1, p2;
I think for a "FOR ALL TABLES" publication (p2 in my case), table tbl should be
treated as having no filter, and table tbl should have no filter in subscription sub. Thoughts?
But for now, the filter (a > 10) works both when copying initial data and later changes.
To fix it, I think we can check if the table is published in a 'FOR ALL TABLES'
publication or published as part of schema in function pgoutput_row_filter_init
(which was introduced in v44-0003 patch), also we need to make some changes in
tablesync.c.
Regards
Tang
On Tue, Dec 7, 2021 at 12:18 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Friday, December 3, 2021 10:09 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Thu, Dec 2, 2021 at 2:32 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:

On Thursday, December 2, 2021 5:21 AM Peter Smith
<smithpb2250@gmail.com> wrote:
PSA the v44* set of patches.
Thanks for the new patch. Few comments:
1. This is an example in publication doc, but in fact it's not allowed. Should we
change this example?

+CREATE PUBLICATION active_departments FOR TABLE departments WHERE
(active IS TRUE);
postgres=# CREATE PUBLICATION active_departments FOR TABLE departments
WHERE (active IS TRUE);
ERROR: invalid publication WHERE expression for relation "departments"
HINT: only simple expressions using columns, constants and immutable system functions are allowed
Thanks for finding this. Actually, the documentation looks correct to
me. The problem was the validation walker of patch 0002 was being
overly restrictive. It needed to also allow a BooleanTest node.

Now it works (locally) for me. For example.
test_pub=# create table departments(depno int primary key, active boolean);
CREATE TABLE
test_pub=# create publication pdept for table departments where
(active is true) with (publish="insert");
CREATE PUBLICATION
test_pub=# create publication pdept2 for table departments where
(active is false) with (publish="insert");
CREATE PUBLICATION
This fix will be available in v45*.
Thanks for looking into it.
I have another problem with your patch. The document says:
... If the subscription has several publications in
+ which the same table has been published with different filters, those
+ expressions get OR'ed together so that rows satisfying any of the expressions
+ will be replicated. Notice this means if one of the publications has no filter
+ at all then all other filters become redundant.
Then, what if one of the publications is specified as 'FOR ALL TABLES' or 'FOR
ALL TABLES IN SCHEMA'.
For example:
create table tbl (a int primary key);
create publication p1 for table tbl where (a > 10);
create publication p2 for all tables;
create subscription sub connection 'dbname=postgres port=5432' publication p1, p2;
Thanks for the example. I was wondering about this case myself.
I think for "FOR ALL TABLE" publication(p2 in my case), table tbl should be
treated as no filter, and table tbl should have no filter in subscription sub. Thoughts?But for now, the filter(a > 10) works both when copying initial data and later changes.
To fix it, I think we can check if the table is published in a 'FOR ALL TABLES'
publication or published as part of schema in function pgoutput_row_filter_init
(which was introduced in v44-0003 patch), also we need to make some changes in
tablesync.c.
In order to check "FOR ALL_TABLES", we might need to fetch publication
metdata. Instead of that can we add a "TRUE" filter on all the tables
which are part of FOR ALL TABLES publication?
--
Best Wishes,
Ashutosh Bapat
On Tue, Dec 7, 2021 at 6:31 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
On Tue, Dec 7, 2021 at 12:18 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
I have another problem with your patch. The document says:
... If the subscription has several publications in
+ which the same table has been published with different filters, those
+ expressions get OR'ed together so that rows satisfying any of the expressions
+ will be replicated. Notice this means if one of the publications has no filter
+ at all then all other filters become redundant.
Then, what if one of the publications is specified as 'FOR ALL TABLES' or 'FOR
ALL TABLES IN SCHEMA'.
For example:
create table tbl (a int primary key);
create publication p1 for table tbl where (a > 10);
create publication p2 for all tables;
create subscription sub connection 'dbname=postgres port=5432' publication p1, p2;
Thanks for the example. I was wondering about this case myself.
I think we should handle this case.
I think for "FOR ALL TABLE" publication(p2 in my case), table tbl should be
treated as no filter, and table tbl should have no filter in subscription sub. Thoughts?But for now, the filter(a > 10) works both when copying initial data and later changes.
To fix it, I think we can check if the table is published in a 'FOR ALL TABLES'
publication or published as part of schema in function pgoutput_row_filter_init
(which was introduced in v44-0003 patch), also we need to make some changes in
tablesync.c.
In order to check "FOR ALL TABLES", we might need to fetch publication
metadata.
Do we really need to perform a separate fetch for this? In
get_rel_sync_entry(), we already have this information, can't we
someway stash that in the corresponding RelationSyncEntry so that same
can be used later for row filtering.
Instead of that can we add a "TRUE" filter on all the tables
which are part of FOR ALL TABLES publication?
How? We won't have an entry for such tables in pg_publication_rel
where we store row_filter information.
--
With Regards,
Amit Kapila.
On Thu, Dec 2, 2021 at 2:59 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
...
Attach the v44-0005 top-up patch.
This version addressed all the comments received so far,
mainly including the following changes:
1) rename rfcol_valid_for_replica to rfcol_valid
2) Remove the struct PublicationInfo and add the rfcol_valid flag directly in relation
3) report the invalid column number in the error message.
4) Rename some functions to match their usage.
5) Fix some typos and add some code comments.
6) Fix an omission in a testcase.
Below are my review comments for the most recent v44-0005 (top-up) patch:
======
1. src/backend/executor/execReplication.c
+ invalid_rfcol = RelationGetInvalRowFilterCol(rel);
+
+ /*
+ * It is only safe to execute UPDATE/DELETE when all columns of the row
+ * filters from publications which the relation is in are part of the
+ * REPLICA IDENTITY.
+ */
+ if (invalid_rfcol != InvalidAttrNumber)
+ {
It seemed confusing that when the invalid_rfcol is NOT invalid at all
then it is InvalidAttrNumber, so perhaps this code would be easier to
read if instead the condition was written just as:
---
if (invalid_rfcol)
{
...
}
---
====
2. invalid_rfcol var name
This variable name is used in a few places but I thought it was too
closely named with the "rfcol_valid" variable even though it has a
completely different meaning. IMO "invalid_rfcol" might be better
named "invalid_rfcolnum" or something like that to reinforce that it
is an AttributeNumber.
====
3. src/backend/utils/cache/relcache.c - function comment
+ * If not all the row filter columns are part of REPLICA IDENTITY, return the
+ * invalid column number, InvalidAttrNumber otherwise.
+ */
Minor rewording:
"InvalidAttrNumber otherwise." --> "otherwise InvalidAttrNumber."
====
4. src/backend/utils/cache/relcache.c - function name
+AttrNumber
+RelationGetInvalRowFilterCol(Relation relation)
IMO nothing was gained by saving 2 chars of the name.
"RelationGetInvalRowFilterCol" --> "RelationGetInvalidRowFilterCol"
====
5. src/backend/utils/cache/relcache.c
+/* For invalid_rowfilter_column_walker. */
+typedef struct {
+ AttrNumber invalid_rfcol;
+ Bitmapset *bms_replident;
+} rf_context;
+
The members should be commented.
====
6. src/include/utils/rel.h
/*
+ * true if the columns of row filters from all the publications the
+ * relation is in are part of replica identity.
+ */
+ bool rd_rfcol_valid;
I felt the member comment is not quite telling the full story. e.g.
IIUC this member is also true when pubaction is something other than
update/delete - but that case doesn't even do replica identity
checking at all. There might not even be any replica identity.
====
7. src/test/regress/sql/publication.sql
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+update rf_tbl_abcd_pk set a = 1;
+DROP PUBLICATION testpub6;
-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
-SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
DROP PUBLICATION testpub6;
-RESET client_min_messages;
--- fail - "a" is not in REPLICA IDENTITY INDEX
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+update rf_tbl_abcd_nopk set a = 1;
The "update" DML should be uppercase "UPDATE" for consistency with the
surrounding tests.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
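As an aside for readers following the invalid_rfcol comments above: the walker's contract — return the first row-filter column that is not covered by the replica identity, otherwise InvalidAttrNumber — can be modeled standalone. This sketch uses plain integer bitmasks instead of PostgreSQL's Bitmapset, purely for illustration:

```c
#include <assert.h>

#define InvalidAttrNumber 0
typedef int AttrNumber;

/* Return the first column (1-based attribute number) that appears in the
 * row filter but not in the replica identity; InvalidAttrNumber if every
 * filter column is covered. Bit (attno - 1) set => column present. */
static AttrNumber
first_uncovered_column(unsigned rf_cols, unsigned replident_cols)
{
    for (AttrNumber attno = 1; attno <= 32; attno++)
    {
        unsigned bit = 1u << (attno - 1);

        if ((rf_cols & bit) && !(replident_cols & bit))
            return attno; /* this column makes UPDATE/DELETE unsafe */
    }
    return InvalidAttrNumber; /* all filter columns are in the identity */
}
```

This is why the function returns an AttrNumber at all: the caller wants to name the offending column in its error message (review comment 3 above).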
On Mon, Dec 6, 2021 at 6:04 PM Euler Taveira <euler@eulerto.com> wrote:
On Mon, Dec 6, 2021, at 3:35 AM, Dilip Kumar wrote:
On Mon, Dec 6, 2021 at 6:49 AM Euler Taveira <euler@eulerto.com> wrote:
On Fri, Dec 3, 2021, at 8:12 PM, Euler Taveira wrote:
PS> I will update the commit message in the next version. I barely changed the
documentation to reflect the current behavior. I probably missed some changes
but I will fix them in the next version.
I realized that I forgot to mention a few things about the UPDATE behavior.
Regardless of 0003, we need to define which tuple will be used to evaluate the
row filter for UPDATEs. We already discussed it circa [1]. This current version
chooses the *new* tuple. Is it the best choice?
But with 0003, we are using both the tuples for evaluating the row
filter, so instead of fixing 0001, why we don't just merge 0003 with
0001? I mean eventually, 0003 is doing what is the agreed behavior,
i.e. if just OLD is matching the filter then convert the UPDATE to
DELETE; OTOH if only the new one matches the filter then convert the UPDATE
to INSERT. Do you think that even if we merge 0001 and 0003 there is still
an open issue regarding which row to select for the filter?
Maybe I was not clear. IIUC we are still discussing 0003 and I would like to
propose a different default based on the conclusion I came up with. If we merged
0003, that's fine; this change will be useless. If we don't, or it is optional,
it still has its merit.
Do we want to pay the overhead of evaluating both tuples for UPDATEs? I'm still
processing if it is worth it. If you think that in general the row filter
contains the primary key and it is rare to change it, it will waste cycles
evaluating the same expression twice. It seems this behavior could be
controlled by a parameter.
I think the first thing we should do in this regard is to evaluate the
performance for both cases (when we apply a filter to both tuples vs.
to one of the tuples). In case the performance difference is
unacceptable, I think it would be better to still compare both tuples
as default to avoid data inconsistency issues and have an option to
allow comparing one of the tuples.
--
With Regards,
Amit Kapila.
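For reference, the UPDATE transformation that 0003 implements when both tuples are evaluated reduces to a four-row decision table. This is a hypothetical standalone sketch of that table, not the patch's code:

```c
#include <assert.h>
#include <stdbool.h>

typedef enum
{
    CHANGE_INSERT,  /* old doesn't match, new does  */
    CHANGE_UPDATE,  /* both tuples match the filter */
    CHANGE_DELETE,  /* old matches, new doesn't     */
    CHANGE_DROP     /* neither matches: skip change */
} ChangeAction;

/* Decide how to publish an UPDATE given the row-filter verdict on the
 * old and the new tuple. */
static ChangeAction
transform_update(bool old_matches, bool new_matches)
{
    if (old_matches && new_matches)
        return CHANGE_UPDATE;
    if (old_matches)
        return CHANGE_DELETE;
    if (new_matches)
        return CHANGE_INSERT;
    return CHANGE_DROP;
}
```

The performance question raised above is whether running the filter twice (once per tuple) is worth it when, in the common case, the filter references key columns that rarely change.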
On Wed, Dec 8, 2021 at 10:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Do we really need to perform a separate fetch for this? In
get_rel_sync_entry(), we already have this information, can't we
someway stash that in the corresponding RelationSyncEntry so that same
can be used later for row filtering.
Instead of that can we add a "TRUE" filter on all the tables
which are part of FOR ALL TABLES publication?
How? We won't have an entry for such tables in pg_publication_rel
where we store row_filter information.
I missed that. Your solution works. Thanks.
--
Best Wishes,
Ashutosh Bapat
On Tue, Dec 7, 2021 at 5:36 PM Peter Smith <smithpb2250@gmail.com> wrote:
We were mid-way putting together the next v45* when your latest
attachment was posted over the weekend. So we will proceed with our
original plan to post our v45* (tomorrow).
After v45* is posted, we will pause to find all the
differences between your unified patch and our v45* patch set. Our
intention is to integrate as many improvements as possible from your
changes into the v46* etc that will follow tomorrow’s v45*. On some
points, we will most likely need further discussion.
Posting an update for review comments, using contributions largely
from Peter Smith.
I've also included changes based on Euler's combined patch, especially
changes to documentation and test cases.
I have left out Hou-san's 0005, in this patch-set. Hou-san will
provide a rebased update based on this.
This patch addresses the following review comments:
On Wed, Nov 24, 2021 at 8:52 PM vignesh C <vignesh21@gmail.com> wrote:
Few comments:
1) I'm not sure if we will be able to throw a better error message in
this case "ERROR: missing FROM-clause entry for table "t4"", if
possible you could change it.
Fixed this.
On Thu, Dec 2, 2021 at 7:40 PM vignesh C <vignesh21@gmail.com> wrote:
1) Both testpub5a and testpub5c publications are the same, one of them can be removed:
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
Fixed
On Fri, Dec 3, 2021 at 6:16 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
doc/src/sgml/ref/create_subscription.sgml
(2) Refer to Notes
Perhaps a link to the Notes section should be used here, as follows:
- copied. Refer to the Notes section below.
+ copied. Refer to the <xref linkend="sql-createsubscription-notes"/> section below.
Fixed
1) Typo in patch comment
"Specifially"
Fixed
src/backend/catalog/pg_publication.c
2) bms_replident comment
Member "Bitmapset *bms_replident;" in rf_context should have a
comment, maybe something like "set of replica identity col indexes".
Fixed
3) errdetail message
In rowfilter_walker(), the "forbidden" errdetail message is loaded
using gettext() in one instance, but just a raw formatted string in
other cases. Shouldn't they all consistently be translated strings?
Fixed
(i)
if (slot == NULL || TTS_EMPTY(slot))
can be replaced with:
if (TupIsNull(slot))
Fixed
(ii) In the above case (where values and nulls are palloc'd),
shouldn't the values and nulls be pfree()d at the end of the function?
Fixed; changed them into fixed-size arrays
On Thu, Dec 2, 2021 at 2:32 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
1. This is an example in publication doc, but in fact it's not allowed. Should we
change this example?
Fixed.
2. A typo in 0002 patch.
+ * drops such a user-defnition or if there is any other error via its function,
"user-defnition" should be "user-definition".
Fixed
On Fri, Dec 3, 2021 at 12:59 AM Euler Taveira <euler@eulerto.com> wrote:
ExprState cache logic is basically all the same as before (including
all the OR combining), but there are now 4x ExprState caches keyed and
separated by the 4x different pubactions.
Row filter is not applied for TRUNCATEs so it is just 3 operations.
Fixed
regards,
Ajin Cherian
Fujitsu Australia
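The per-pubaction caching introduced in v45-0005 below keys the combined ExprState by operation (TRUNCATE gets no slot because row filters never apply to it). A minimal standalone model of the lookup, with hypothetical names mirroring the patch's IDX_PUBACTION_* constants:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum
{
    IDX_PUBACTION_INSERT,
    IDX_PUBACTION_UPDATE,
    IDX_PUBACTION_DELETE,
    IDX_PUBACTION_n
};

typedef bool (*filter_fn)(int key);

typedef struct
{
    /* One combined filter per pubaction; NULL means "no filter, replicate". */
    filter_fn exprstate[IDX_PUBACTION_n];
} RelEntry;

static bool
row_passes(const RelEntry *entry, int idx_pubaction, int key)
{
    if (entry->exprstate[idx_pubaction] == NULL)
        return true; /* bail out early: no row filter for this action */
    return entry->exprstate[idx_pubaction](key);
}

/* Example: only INSERTs are filtered, mimicking WITH (publish="insert") */
static bool only_big(int key) { return key > 100; }
static RelEntry example_entry = {{only_big, NULL, NULL}};
```

Because a publication created WITH (publish="insert") contributes its filter only to the insert slot, an UPDATE on the same table passes unfiltered in this model.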
Attachments:
v45-0004-Tab-auto-complete-and-pgdump-support-for-Row-Fil.patchapplication/octet-stream; name=v45-0004-Tab-auto-complete-and-pgdump-support-for-Row-Fil.patchDownload
From 8dce8d67068315d2075abba1bf590faa658116fe Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 8 Dec 2021 05:34:59 -0500
Subject: [PATCH v45 4/5] Tab auto-complete and pgdump support for Row Filter.
tab-auto-complete
-----------------
e.g.
"CREATE PUBLICATION <name> FOR TABLE <name>" - complete with "WHERE (".
"ALTER PUBLICATION <name> ADD|SET TABLE <name>" - complete with "WHERE (".
Author: Peter Smith
pg_dump
-------
Author: Euler Taveira
---
src/bin/pg_dump/pg_dump.c | 24 ++++++++++++++++++++----
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/tab-complete.c | 15 +++++++++++++--
3 files changed, 34 insertions(+), 6 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 10a86f9..e595c7f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4265,6 +4265,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4275,9 +4276,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4286,6 +4294,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4326,6 +4335,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationObject(&(pubrinfo[j].dobj), fout);
@@ -4393,8 +4406,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 6dccb4b..74f82cd 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -633,6 +633,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 2f412ca..c1591f4 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,14 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (");
+ /* "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with table attributes */
+ /* "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2777,11 +2785,14 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("TABLES", "TABLES IN SCHEMA");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
- else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
+ /* "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
/*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
--
1.8.3.1
v45-0005-Cache-ExprState-per-pubaction.patchapplication/octet-stream; name=v45-0005-Cache-ExprState-per-pubaction.patchDownload
From 8aff7339a51395bcfb3e8f52d9e4f94a648ed915 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 8 Dec 2021 05:37:36 -0500
Subject: [PATCH v45 5/5] Cache ExprState per pubaction.
If a subscriber has multiple publications and these publications include the
same table then there can be multiple filters that apply to that table.
These filters are stored per-pubactions of the publications. There are 4 kinds
of pubaction ("insert", "update", "delete", "truncate"), but row-filters are
not applied for "truncate".
Filters for the same pubaction are all combined (OR'ed) and cached as one, so
at the end there are at most 3 cached filters per table.
The appropriate (pubaction) filter is executed according to the DML operation.
Author: Peter Smith
Discussion: https://www.postgresql.org/message-id/CAA4eK1%2BhVXfOSScbf5LUB%3D5is%3DwYaC6NBhLxuvetbWQnZRnsVQ%40mail.gmail.com
---
src/backend/replication/pgoutput/pgoutput.c | 176 ++++++++++++++++++----------
1 file changed, 117 insertions(+), 59 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 0ccffa7..c244961 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -130,10 +130,16 @@ typedef struct RelationSyncEntry
* The flag 'rowfilter_valid' indicates if the exprstate has been assigned
* yet or not. We cannot just use the exprstate value for this purpose
* because there might be no filter at all for the current relid (e.g.
- * exprstate is NULL).
+ * every exprstate is NULL).
+ * The row-filter exprstate is stored per pubaction type (row-filters are
+ * not applied for "truncate" pubaction).
*/
bool rowfilter_valid;
- ExprState *exprstate; /* ExprState for row filter(s) */
+#define IDX_PUBACTION_INSERT 0
+#define IDX_PUBACTION_UPDATE 1
+#define IDX_PUBACTION_DELETE 2
+#define IDX_PUBACTION_n 3
+ ExprState *exprstate[IDX_PUBACTION_n]; /* ExprState for row filter(s). One per pubaction. */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
@@ -175,10 +181,10 @@ static EState *create_estate_for_relation(Relation rel);
static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(int idx_pubaction, Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, TupleTableSlot *slot,
RelationSyncEntry *entry);
-static bool pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter_update_check(int idx_pubaction, Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry,
ReorderBufferChangeType *action);
@@ -755,7 +761,7 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
* If the change is to be replicated returns true, else false.
*/
static bool
-pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
+pgoutput_row_filter_update_check(int idx_pubaction, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
{
TupleDesc desc = RelationGetDescr(relation);
int i;
@@ -763,7 +769,7 @@ pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTupl
TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
/* Bail out if there is no row filter */
- if (!entry->exprstate)
+ if (!entry->exprstate[idx_pubaction])
return true;
/* update requires a new tuple */
@@ -780,7 +786,7 @@ pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTupl
if (!oldtuple)
{
*action = REORDER_BUFFER_CHANGE_UPDATE;
- return pgoutput_row_filter(relation, NULL, newtuple, NULL, entry);
+ return pgoutput_row_filter(idx_pubaction, relation, NULL, newtuple, NULL, entry);
}
old_slot = entry->old_tuple;
@@ -827,8 +833,8 @@ pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTupl
}
- old_matched = pgoutput_row_filter(relation, NULL, NULL, old_slot, entry);
- new_matched = pgoutput_row_filter(relation, NULL, NULL, tmp_new_slot, entry);
+ old_matched = pgoutput_row_filter(idx_pubaction, relation, NULL, NULL, old_slot, entry);
+ new_matched = pgoutput_row_filter(idx_pubaction, relation, NULL, NULL, tmp_new_slot, entry);
if (!old_matched && !new_matched)
return false;
@@ -850,8 +856,8 @@ static void
pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
{
ListCell *lc;
- List *rfnodes = NIL;
- int n_filters;
+ List *rfnodes[] = {NIL, NIL, NIL}; /* One per pubaction */
+ bool no_filter[] = {false, false, false}; /* One per pubaction */
/*
* If the row filter caching is currently flagged "invalid" then it means we
@@ -907,7 +913,7 @@ pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntr
bool rfisnull;
/*
- * Lookup if there is a row-filter, and if yes remember it in a list.
+ * Lookup if there is a row-filter, and if yes remember it in a list (per pubaction).
* In code following this 'publications' loop we will combine all filters.
*/
rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(entry->publish_as_relid), ObjectIdGetDatum(pub->oid));
@@ -920,56 +926,101 @@ pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntr
Node *rfnode;
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = stringToNode(TextDatumGetCString(rfdatum));
- rfnodes = lappend(rfnodes, rfnode);
+ /* Gather the rfnodes per pubaction of this publiaction. */
+ if (pub->pubactions.pubinsert)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[IDX_PUBACTION_INSERT] = lappend(rfnodes[IDX_PUBACTION_INSERT], rfnode);
+ }
+ if (pub->pubactions.pubupdate)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[IDX_PUBACTION_UPDATE] = lappend(rfnodes[IDX_PUBACTION_UPDATE], rfnode);
+ }
+ if (pub->pubactions.pubdelete)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[IDX_PUBACTION_DELETE] = lappend(rfnodes[IDX_PUBACTION_DELETE], rfnode);
+ }
MemoryContextSwitchTo(oldctx);
-
- ReleaseSysCache(rftuple);
}
else
{
- /*
- * If there is no row-filter, then any other row-filters for this table
- * also have no effect (because filters get OR-ed together) so we can
- * just discard anything found so far and exit early from the publications
- * loop.
- */
- if (rfnodes)
- {
- list_free_deep(rfnodes);
- rfnodes = NIL;
- }
- ReleaseSysCache(rftuple);
- break;
+ /* Remember which pubactions have no row-filter. */
+ if (pub->pubactions.pubinsert)
+ no_filter[IDX_PUBACTION_INSERT] = true;
+ if (pub->pubactions.pubupdate)
+ no_filter[IDX_PUBACTION_UPDATE] = true;
+ if (pub->pubactions.pubdelete)
+ no_filter[IDX_PUBACTION_DELETE] = true;
}
+ ReleaseSysCache(rftuple);
}
} /* loop all subscribed publications */
/*
- * Combine using all the row-filters (if any) into a single filter, and then build the ExprState for it
+ * Now all the filters for all pubactions are known, let's try to combine them
+ * when their pubactions are same.
*/
- n_filters = list_length(rfnodes);
- if (n_filters > 0)
{
- Node *rfnode;
- TupleDesc tupdesc = RelationGetDescr(relation);
+ int idx;
+ bool found_filters = false;
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes, -1) : linitial(rfnodes);
- entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
+ /* For each pubaction... */
+ for (idx = 0; idx < IDX_PUBACTION_n; idx++)
+ {
+ int n_filters;
- /*
- * Create tuple table slots for row filter. Create a copy of the
- * TupleDesc as it needs to live as long as the cache remains.
- */
- tupdesc = CreateTupleDescCopy(tupdesc);
- entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
- entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
- entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
- entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
- MemoryContextSwitchTo(oldctx);
+ /*
+ * If one or more publications with this pubaction had no filter at all,
+ * then that nullifies the effect of all other filters for the same
+ * pubaction (because filters get OR'ed together).
+ */
+ if (no_filter[idx])
+ {
+ if (rfnodes[idx])
+ {
+ list_free_deep(rfnodes[idx]);
+ rfnodes[idx] = NIL;
+ }
+ }
+
+ /*
+ * If there was one or more filter for this pubaction then combine them
+ * (if necessary) and cache the ExprState.
+ */
+ n_filters = list_length(rfnodes[idx]);
+ if (n_filters > 0)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes[idx], -1) : linitial(rfnodes[idx]);
+ entry->exprstate[idx] = pgoutput_row_filter_init_expr(rfnode);
+ MemoryContextSwitchTo(oldctx);
+
+ found_filters = true; /* flag that we will need slots made */
+ }
+ } /* for each pubaction */
+
+ if (found_filters)
+ {
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create tuple table slots for row filter. Create a copy of the
+ * TupleDesc as it needs to live as long as the cache remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ MemoryContextSwitchTo(oldctx);
+ }
}
entry->rowfilter_valid = true;
@@ -982,7 +1033,7 @@ pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntr
* If it returns true, the change is replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *slot,
+pgoutput_row_filter(int idx_pubaction, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *slot,
RelationSyncEntry *entry)
{
EState *estate;
@@ -991,7 +1042,7 @@ RelationSyncEntry *entry)
Oid relid = RelationGetRelid(relation);
/* Bail out if there is no row filter */
- if (!entry->exprstate)
+ if (!entry->exprstate[idx_pubaction])
return true;
if (message_level_is_interesting(DEBUG3))
@@ -1016,12 +1067,12 @@ RelationSyncEntry *entry)
/*
* NOTE: Multiple publication row-filters have already been combined to a
- * single exprstate.
+ * single exprstate (for this pubaction).
*/
- if (entry->exprstate)
+ if (entry->exprstate[idx_pubaction])
{
/* Evaluates row filter */
- result = pgoutput_row_filter_exec_expr(entry->exprstate, ecxt);
+ result = pgoutput_row_filter_exec_expr(entry->exprstate[idx_pubaction], ecxt);
}
/* Cleanup allocated resources */
@@ -1093,7 +1144,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, NULL, tuple, NULL, relentry))
+ if (!pgoutput_row_filter(IDX_PUBACTION_INSERT, relation, NULL, tuple, NULL, relentry))
break;
/*
@@ -1126,7 +1177,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
- if (!pgoutput_row_filter_update_check(relation, oldtuple, newtuple, relentry,
+ if (!pgoutput_row_filter_update_check(IDX_PUBACTION_UPDATE, relation, oldtuple, newtuple, relentry,
&modified_action))
break;
@@ -1180,7 +1231,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, NULL, NULL, relentry))
+ if (!pgoutput_row_filter(IDX_PUBACTION_DELETE, relation, oldtuple, NULL, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1601,7 +1652,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->new_tuple = NULL;
entry->old_tuple = NULL;
entry->tmp_new_tuple = NULL;
- entry->exprstate = NULL;
+ entry->exprstate[IDX_PUBACTION_INSERT] = NULL;
+ entry->exprstate[IDX_PUBACTION_UPDATE] = NULL;
+ entry->exprstate[IDX_PUBACTION_DELETE] = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1768,6 +1821,7 @@ static void
rel_sync_cache_relation_cb(Datum arg, Oid relid)
{
RelationSyncEntry *entry;
+ int idx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1822,10 +1876,14 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
- if (entry->exprstate != NULL)
+ /* Cleanup the ExprState for each of the pubactions. */
+ for (idx = 0; idx < IDX_PUBACTION_n; idx++)
{
- pfree(entry->exprstate);
- entry->exprstate = NULL;
+ if (entry->exprstate[idx] != NULL)
+ {
+ pfree(entry->exprstate[idx]);
+ entry->exprstate[idx] = NULL;
+ }
}
}
}
--
1.8.3.1
v45-0003-Support-updates-based-on-old-and-new-tuple-in-ro.patch
From 1c52e41cc8821b2034824e72516bccd669a75ef2 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 8 Dec 2021 05:32:44 -0500
Subject: [PATCH v45 3/5] Support updates based on old and new tuple in row
filters.
When applying a row filter to updates, check both old_tuple and new_tuple
to decide how the update needs to be transformed:
UPDATE
old-row (match)     new-row (no match)  -> DELETE
old-row (no match)  new-row (match)     -> INSERT
old-row (match)     new-row (match)     -> UPDATE
old-row (no match)  new-row (no match)  -> (drop change)
Tuples that have already been deformed are also cached in slots to avoid
deforming the same tuple more than once.
Author: Ajin Cherian
---
src/backend/replication/logical/proto.c | 38 ++++--
src/backend/replication/pgoutput/pgoutput.c | 194 +++++++++++++++++++++++++---
src/include/replication/logicalproto.h | 7 +-
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/027_row_filter.pl | 4 +-
src/tools/pgindent/typedefs.list | 1 +
6 files changed, 212 insertions(+), 38 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..110ccff 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -31,8 +32,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
- HeapTuple tuple, bool binary);
-
+ HeapTuple tuple, TupleTableSlot *slot,
+ bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -410,7 +411,7 @@ logicalrep_write_insert(StringInfo out, TransactionId xid, Relation rel,
pq_sendint32(out, RelationGetRelid(rel));
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, NULL, binary);
}
/*
@@ -442,7 +443,8 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
*/
void
logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
- HeapTuple oldtuple, HeapTuple newtuple, bool binary)
+ HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary)
{
pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
@@ -463,11 +465,11 @@ logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
pq_sendbyte(out, 'O'); /* old tuple follows */
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, oldslot, binary);
}
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, newslot, binary);
}
/*
@@ -536,7 +538,7 @@ logicalrep_write_delete(StringInfo out, TransactionId xid, Relation rel,
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, NULL, binary);
}
/*
@@ -749,13 +751,16 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
-logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binary)
+logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, TupleTableSlot *slot,
+bool binary)
{
TupleDesc desc;
- Datum values[MaxTupleAttributeNumber];
- bool isnull[MaxTupleAttributeNumber];
+ Datum *values;
+ bool *isnull;
int i;
uint16 nliveatts = 0;
+ Datum attr_values[MaxTupleAttributeNumber];
+ bool attr_isnull[MaxTupleAttributeNumber];
desc = RelationGetDescr(rel);
@@ -771,7 +776,17 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binar
enlargeStringInfo(out, tuple->t_len +
nliveatts * (1 + 4));
- heap_deform_tuple(tuple, desc, values, isnull);
+ if (TupIsNull(slot))
+ {
+ values = attr_values;
+ isnull = attr_isnull;
+ heap_deform_tuple(tuple, desc, values, isnull);
+ }
+ else
+ {
+ values = slot->tts_values;
+ isnull = slot->tts_isnull;
+ }
/* Write the values */
for (i = 0; i < desc->natts; i++)
@@ -832,6 +847,7 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binar
ReleaseSysCache(typtup);
}
+
}
/*
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 3b85915..0ccffa7 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,6 +13,7 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
#include "catalog/pg_publication_rel.h"
@@ -25,6 +26,7 @@
#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
#include "utils/builtins.h"
@@ -132,7 +134,10 @@ typedef struct RelationSyncEntry
*/
bool rowfilter_valid;
ExprState *exprstate; /* ExprState for row filter(s) */
- TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
+ TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
+ TupleTableSlot *tmp_new_tuple; /* slot for temporary new tuple used for expression evaluation */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -167,10 +172,15 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
- HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, TupleTableSlot *slot,
+ RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry,
+ ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -734,18 +744,112 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * The update is checked against the row filter, if any.
+ * Updates are transformed to inserts or deletes based on whether the
+ * old_tuple and new_tuple match the filter. The new action is returned
+ * in the action parameter; if it is not changed, the action remains an
+ * update.
+ * old-row (no match) new-row (no match) -> (drop change)
+ * old-row (no match) new-row (match) -> INSERT
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (match) new-row (match) -> UPDATE
+ * Returns true if the change is to be replicated, else false.
*/
static bool
-pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
+{
+ TupleDesc desc = RelationGetDescr(relation);
+ int i;
+ bool old_matched, new_matched;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ /* update requires a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /*
+ * If there is no old_tuple, then none of the replica identity columns
+ * changed, so this reduces to a simple update.
+ */
+ if (!oldtuple)
+ {
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ return pgoutput_row_filter(relation, NULL, newtuple, NULL, entry);
+ }
+
+ old_slot = entry->old_tuple;
+ new_slot = entry->new_tuple;
+ tmp_new_slot = entry->tmp_new_tuple;
+ ExecClearTuple(old_slot);
+ ExecClearTuple(new_slot);
+ ExecClearTuple(tmp_new_slot);
+
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, new_slot);
+
+ /*
+ * For updates, both the newtuple and oldtuple need to be checked
+ * against the row filter. The newtuple might not have all the
+ * replica identity columns, in which case those values need to be
+ * copied over from the oldtuple.
+ */
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ /* if the column in the new_tuple is null, nothing to do */
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ /*
+ * Unchanged toasted replica identity columns are only detoasted in the
+ * old tuple; copy these over to the new tuple.
+ */
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ Assert(!old_slot->tts_isnull[i] &&
+ !VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]));
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ }
+
+ }
+
+ old_matched = pgoutput_row_filter(relation, NULL, NULL, old_slot, entry);
+ new_matched = pgoutput_row_filter(relation, NULL, NULL, tmp_new_slot, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (!old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && old_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter the first time it is needed.
+ */
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
{
- EState *estate;
- ExprContext *ecxt;
ListCell *lc;
- bool result = true;
- Oid relid = RelationGetRelid(relation);
List *rfnodes = NIL;
int n_filters;
@@ -857,16 +961,34 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
/*
- * Create a tuple table slot for row filter. TupleDesc must live as
- * long as the cache remains.
+ * Create tuple table slots for row filter. Create a copy of the
+ * TupleDesc as it needs to live as long as the cache remains.
*/
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
MemoryContextSwitchTo(oldctx);
}
entry->rowfilter_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *slot,
+RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
/* Bail out if there is no row filter */
if (!entry->exprstate)
@@ -885,7 +1007,12 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
ecxt = GetPerTupleExprContext(estate);
ecxt->ecxt_scantuple = entry->scantuple;
- ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+ if (newtuple || oldtuple)
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+ else
+ {
+ ecxt->ecxt_scantuple = slot;
+ }
/*
* NOTE: Multiple publication row-filters have already been combined to a
@@ -898,7 +1025,6 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
}
/* Cleanup allocated resources */
- ResetExprContext(ecxt);
FreeExecutorState(estate);
PopActiveSnapshot();
@@ -956,6 +1082,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -964,7 +1093,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(relation, NULL, tuple, NULL, relentry))
break;
/*
@@ -995,9 +1124,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
- /* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter_update_check(relation, oldtuple, newtuple, relentry,
+ &modified_action))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1020,8 +1150,27 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ logicalrep_write_update(ctx->out, xid, relation,
+ oldtuple, newtuple, relentry->old_tuple,
+ relentry->new_tuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1031,7 +1180,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1449,6 +1598,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
+ entry->new_tuple = NULL;
+ entry->old_tuple = NULL;
+ entry->tmp_new_tuple = NULL;
entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..427c40a 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -211,7 +212,11 @@ extern void logicalrep_write_insert(StringInfo out, TransactionId xid,
extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup);
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
- HeapTuple newtuple, bool binary);
+ HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index de6b73d..a2f25f6 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -277,7 +277,8 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+#   but this row is deleted by the UPDATE below.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -289,7 +290,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 575969c..e8dc5ad 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2198,6 +2198,7 @@ ReorderBufferApplyChangeCB
ReorderBufferApplyTruncateCB
ReorderBufferBeginCB
ReorderBufferChange
+ReorderBufferChangeType
ReorderBufferCommitCB
ReorderBufferCommitPreparedCB
ReorderBufferDiskChange
--
1.8.3.1
v45-0002-PS-Row-filter-validation-walker.patch
From e8352872c75bc917398dbd41c8e273a2c98bbc34 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 8 Dec 2021 04:46:07 -0500
Subject: [PATCH v45 2/5] PS - Row filter validation walker
This patch implements a parse-tree "walker" to validate a row-filter expression.
REPLICA IDENTITY validation
---------------------------
For publish modes "delete" and "update", it validates that any columns
referenced in the filter expression are part of the REPLICA IDENTITY or
primary key.
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
Expression Node-kind validation
-------------------------------
Only simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no user-defined functions.
- no user-defined types.
- no system functions (unless they are IMMUTABLE). See design decision at [1].
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr, NullIfExpr, NullTest, BooleanTest
[1] https://www.postgresql.org/message-id/CAA4eK1%2BXoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ%40mail.gmail.com
---
src/backend/catalog/pg_publication.c | 199 +++++++++++++++++++++++++++++-
src/backend/parser/parse_agg.c | 11 +-
src/backend/parser/parse_expr.c | 19 +--
src/backend/parser/parse_func.c | 3 +-
src/backend/parser/parse_oper.c | 7 --
src/test/regress/expected/publication.out | 146 +++++++++++++++++++---
src/test/regress/sql/publication.sql | 108 +++++++++++++++-
src/test/subscription/t/027_row_filter.pl | 7 +-
src/tools/pgindent/typedefs.list | 1 +
9 files changed, 444 insertions(+), 57 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index dff1625..1630d26 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,9 +33,11 @@
#include "catalog/pg_publication_namespace.h"
#include "catalog/pg_publication_rel.h"
#include "catalog/pg_type.h"
+#include "catalog/pg_proc.h"
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parse_clause.h"
#include "parser/parse_collate.h"
#include "parser/parse_relation.h"
@@ -249,10 +251,200 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
+/* For rowfilter_walker. */
+typedef struct {
+ Relation rel;
+ bool check_replident; /* check if Var is bms_replident member? */
+ Bitmapset *bms_replident; /* bitset of replica identity col indexes */
+} rf_context;
+
/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
+ * The row filter walker checks that the row filter expression is legal.
+ *
+ * Rules: Node-type validation
+ * ---------------------------
+ * Allow only simple or compound expressions such as:
+ * - "(Var Op Const)" or
+ * - "(Var Op Var)" or
+ * - "(Var Op Const) Bool (Var Op Const)"
+ * - etc
+ * (where Var is a column of the table this filter belongs to)
+ *
+ * Specifically,
+ * - User-defined operators are not allowed.
+ * - User-defined functions are not allowed.
+ * - User-defined types are not allowed.
+ * - System functions that are not IMMUTABLE are not allowed.
+ * - NULLIF is allowed.
+ * - IS NULL is allowed.
+ * - IS TRUE/FALSE is allowed.
+ *
+ * Notes:
+ *
+ * We don't allow user-defined functions/operators/types because (a) if the
+ * user drops such an object, or its function raises any other error, the
+ * walsender won't be able to recover even after the function's problem is
+ * fixed, because a historic snapshot is used to access the row filter;
+ * (b) a function could access other tables, which won't work because of the
+ * historic snapshots used in the logical decoding environment.
+ *
+ * We don't allow anything other than immutable built-in functions because
+ * non-immutable functions can access the database, which would lead to
+ * problem (b) mentioned in the previous paragraph.
+ *
+ * Rules: Replica Identity validation
+ * -----------------------------------
+ * If the flag context.check_replident is true then validate that every variable
+ * referenced by the filter expression is a valid member of the allowed set of
+ * replica identity columns (context.bms_replident).
*/
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+ bool too_complex = false;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+
+ /* User-defined types not allowed. */
+ if (var->vartype >= FirstNormalObjectId)
+ forbidden = _("user-defined types are not allowed");
+
+ /* Optionally, do replica identity validation of the referenced column. */
+ if (context->check_replident)
+ {
+ Oid relid = RelationGetRelid(context->rel);
+ AttrNumber attnum = var->varattno;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, context->bms_replident))
+ {
+ const char *colname = get_attname(relid, attnum, false);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ RelationGetRelationName(context->rel)),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+ }
+ }
+ else if (IsA(node, Const) || IsA(node, BoolExpr) || IsA(node, NullIfExpr)
+ || IsA(node, NullTest) || IsA(node, BooleanTest))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ Oid funcid = ((FuncExpr *)node)->funcid;
+ char *funcname = get_func_name(funcid);
+
+ /*
+ * User-defined functions are not allowed.
+ * System functions that are not IMMUTABLE are not allowed.
+ */
+ if (funcid >= FirstNormalObjectId)
+ {
+ forbidden = psprintf(_("user-defined functions are not allowed: %s"),
+ funcname);
+ }
+ else
+ {
+ if (func_volatile(funcid) != PROVOLATILE_IMMUTABLE)
+ forbidden = psprintf(_("system functions that are not IMMUTABLE are not allowed: %s"),
+ funcname);
+ }
+ }
+ else
+ {
+ elog(DEBUG1, "the row filter contained something unexpected: %s", nodeToString(node));
+ too_complex = true;
+ }
+
+ if (too_complex)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(context->rel)),
+ errhint("Only simple expressions using columns, constants, and immutable system functions are allowed.")
+ ));
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(context->rel)),
+ errdetail("%s", forbidden)
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
+/*
+ * Check if the row-filter is valid according to the following rules:
+ *
+ * 1. Only certain simple node types are permitted in the expression. See
+ * function rowfilter_walker for details.
+ *
+ * 2. If the publish operation contains "delete" or "update" then only columns
+ * that are allowed by the REPLICA IDENTITY rules are permitted to be used in
+ * the row-filter WHERE clause.
+ */
+static void
+rowfilter_expr_checker(Publication *pub, Relation rel, Node *rfnode)
+{
+ rf_context context = {0};
+
+ context.rel = rel;
+
+ /*
+ * For "delete" or "update", check that filter cols are also valid replica
+ * identity cols.
+ */
+ if (pub->pubactions.pubdelete || pub->pubactions.pubupdate)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ context.check_replident = true;
+
+ /*
+ * Find what are the cols that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+ }
+ }
+
+ /*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if anything not permitted or unexpected is encountered.
+ */
+ rowfilter_walker(rfnode, &context);
+
+ bms_free(context.bms_replident);
+}
+
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -363,6 +555,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
whereclause = GetTransformedWhereClause(pstate, pri, true);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, targetrel, whereclause);
}
/* Form a tuple. */
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..c95e14d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,10 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
- if (isAgg)
- err = _("aggregate functions are not allowed in publication WHERE expressions");
- else
- err = _("grouping operations are not allowed in publication WHERE expressions");
-
+ /*
+ * OK for now. The row-filter validation is done later by a walker
+ * function (see pg_publication).
+ */
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +950,7 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
- err = _("window functions are not allowed in publication WHERE expressions");
+ /* okay (see function rowfilter_walker) */
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..6d47bf8 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,19 +200,8 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- {
- /*
- * Forbid functions in publication WHERE condition
- */
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("functions are not allowed in publication WHERE expressions"),
- parser_errposition(pstate, exprLocation(expr))));
-
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
- }
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -1777,7 +1766,7 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
- err = _("cannot use subquery in publication WHERE expression");
+ /* okay (see function rowfilter_walker) */
break;
/*
@@ -3100,7 +3089,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 29bebb7..cfae0da 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,8 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
- err = _("set-returning functions are not allowed in publication WHERE expressions");
+ /* okay (see function rowfilter_walker) */
+ pstate->p_hasTargetSRFs = true;
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..bc34a23 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,13 +718,6 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
- /* Check it's not a custom operator for publication WHERE expressions */
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("user-defined operators are not allowed in publication WHERE expressions"),
- parser_errposition(pstate, location)));
-
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 1e78a04..a772975 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -248,13 +248,15 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish='insert' because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
@@ -264,7 +266,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
@@ -275,7 +277,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE ((e > 1000) AND (e < 2000))
@@ -286,7 +288,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
@@ -308,26 +310,26 @@ Publications:
DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax1
Publication testpub_syntax1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (e < 999)
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax2
Publication testpub_syntax2
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"testpub_rf_myschema.testpub_rf_tbl5" WHERE (h < 999)
@@ -351,19 +353,43 @@ ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: user-defined functions are not allowed: testpub_rf_func99
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: system functions that are not IMMUTABLE are not allowed: random
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
+-- ok - IS NULL is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
+-- ok - IS TRUE/FALSE is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE ((a > 5) IS FALSE);
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+-- fail - user-defined types disallowed
+CREATE TYPE rf_bug_status AS ENUM ('new', 'open', 'closed');
+CREATE TABLE rf_bug (id serial, description text, status rf_bug_status);
+CREATE PUBLICATION testpub6 FOR TABLE rf_bug WHERE (status = 'open') WITH (publish = 'insert');
+ERROR: invalid publication WHERE expression for relation "rf_bug"
+DETAIL: user-defined types are not allowed
+DROP TABLE rf_bug;
+DROP TYPE rf_bug_status;
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+HINT: only simple expressions using columns, constants and immutable system functions are allowed
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
@@ -385,6 +411,92 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 9bcd7d2..e8242c9 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -143,7 +143,9 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish='insert' because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -162,12 +164,12 @@ RESET client_min_messages;
DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax1
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax2
DROP PUBLICATION testpub_syntax2;
@@ -181,13 +183,33 @@ SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
+-- ok - IS NULL is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
+-- ok - IS TRUE/FALSE is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE ((a > 5) IS FALSE);
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - user-defined types disallowed
+CREATE TYPE rf_bug_status AS ENUM ('new', 'open', 'closed');
+CREATE TABLE rf_bug (id serial, description text, status rf_bug_status);
+CREATE PUBLICATION testpub6 FOR TABLE rf_bug WHERE (status = 'open') WITH (publish = 'insert');
+DROP TABLE rf_bug;
+DROP TYPE rf_bug_status;
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
SET client_min_messages = 'ERROR';
@@ -207,6 +229,82 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
+
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index 64e71d0..de6b73d 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -280,9 +282,7 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -291,7 +291,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index f41ef0d..575969c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3501,6 +3501,7 @@ replace_rte_variables_context
ret_type
rewind_source
rewrite_event
+rf_context
rijndael_ctx
rm_detail_t
role_auth_extra
--
1.8.3.1
v45-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v45-0001-Row-filter-for-logical-replication.patchDownload
From d41e29ac5bcae8e15c67a538b40b0ae5e03e6488 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 8 Dec 2021 03:55:06 -0500
Subject: [PATCH v45 1/5] Row filter for logical replication.
This feature adds row filtering for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The row-filter WHERE clause for a table added to a publication that publishes
UPDATE and/or DELETE operations must contain only columns that are covered by
REPLICA IDENTITY. The row-filter WHERE clause for a table added to a publication
that publishes INSERT can use any column. The WHERE clause does not allow
user-defined functions / operators / types; it also does not allow built-in
functions unless they are immutable.
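
A minimal SQL sketch of these rules (the table, column, and publication names
are illustrative, not part of the patch):

```sql
-- Assume a table on the publisher: t(id int primary key, note text)

-- ok: "id" is covered by the REPLICA IDENTITY (the primary key)
CREATE PUBLICATION pub_mid FOR TABLE t WHERE (id > 2000 AND id <= 3000);

-- fails with the default publish settings (UPDATE/DELETE are published):
-- "note" is not part of the REPLICA IDENTITY
CREATE PUBLICATION pub_note FOR TABLE t WHERE (note <> 'filtered');

-- ok: the same filter is allowed if the publication only publishes INSERT
CREATE PUBLICATION pub_note_ins FOR TABLE t
    WHERE (note <> 'filtered') WITH (publish = 'insert');

-- fails: random() is a volatile built-in function; only immutable
-- built-in functions are allowed
CREATE PUBLICATION pub_rand FOR TABLE t WHERE (id < random() * 100);
```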
If you choose to do the initial table synchronization, only data that satisfies
the row filters is pulled by the subscriber. If the subscription has several
publications in which a table has been published with different WHERE clauses,
rows that satisfy ANY of the expressions will be copied. If the subscriber is a
pre-15 version, data synchronization won't use row filters even if they are
defined in the publisher, because previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines whether it uses the partition's row
filter (if the parameter is false, the default) or the root partitioned
table's row filter.
Psql commands \dRp+ and \d+ will display any row filters.
Author: Euler Taveira, Peter Smith
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
Combining multiple row-filters
==============================
The subscription is treated "as a union of all the publications" [1], so the
row-filters are combined with OR.
If the subscription has several publications in which the same table has been
published with different filters, those expressions get OR'ed together so that
rows satisfying any of the expressions will be replicated.
Note that this means that if one of the publications has no filter at all,
then all the other filters become redundant.
Author: Peter Smith
[1] https://www.postgresql.org/message-id/574b4e78-2f35-acf3-4bdc-4b872582e739%40enterprisedb.com
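
The OR-combination above can be sketched in SQL (the table, publication, and
subscription names are illustrative):

```sql
-- Two publications over the same table with different row filters.
CREATE PUBLICATION pub_low  FOR TABLE t WHERE (id < 100);
CREATE PUBLICATION pub_high FOR TABLE t WHERE (id > 900);

-- A subscription to both replicates rows satisfying either filter,
-- i.e. effectively WHERE (id < 100) OR (id > 900).
CREATE SUBSCRIPTION sub CONNECTION 'dbname=src host=pubhost'
    PUBLICATION pub_low, pub_high;
```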
Row-filter caching
==================
The cached row-filters (e.g. ExprState *) are invalidated only in function
rel_sync_cache_relation_cb, so an ALTER PUBLICATION affecting one table will
not invalidate the cached row-filters of other tables.
The row-filters are cached lazily, just before they are needed (in the
pgoutput_row_filter function).
If there are multiple publication filters for a given table, they are all
combined/flattened into a single filter.
Author: Peter Smith, Greg Nancarrow
The filter caching is based on suggestions from Amit [1] [2] and Houz [3]
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
[3] https://www.postgresql.org/message-id/OS0PR01MB5716090A70A73ADF58C58950948D9%40OS0PR01MB5716.jpnprd01.prod.outlook.com
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 13 +-
doc/src/sgml/ref/create_publication.sgml | 28 ++-
doc/src/sgml/ref/create_subscription.sgml | 20 +-
src/backend/catalog/pg_publication.c | 62 ++++-
src/backend/commands/publicationcmds.c | 105 ++++++--
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 38 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/parser/parse_relation.c | 9 +
src/backend/replication/logical/tablesync.c | 116 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 344 ++++++++++++++++++++++++++-
src/bin/psql/describe.c | 26 +-
src/include/catalog/pg_publication.h | 7 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 146 ++++++++++++
src/test/regress/sql/publication.sql | 74 ++++++
src/test/subscription/t/027_row_filter.pl | 357 ++++++++++++++++++++++++++++
24 files changed, 1354 insertions(+), 50 deletions(-)
mode change 100644 => 100755 src/backend/parser/gram.y
create mode 100644 src/test/subscription/t/027_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1d11be..af6b1f6 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6299,6 +6299,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index bb4ef5e..5d9869c 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -30,7 +30,7 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -52,7 +52,9 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
remove one or more tables/schemas from the publication. Note that adding
tables/schemas to a publication that is already subscribed to will require a
<literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on the
- subscribing side in order to become effective.
+ subscribing side in order to become effective. Note also that the combination
+ of <literal>DROP</literal> with <literal>WHERE</literal> clause is not
+ allowed.
</para>
<para>
@@ -109,7 +111,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index d805e8e..d950316 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -28,7 +28,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -76,6 +76,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -226,6 +230,16 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The row-filter <literal>WHERE</literal> clause for a table added to a publication that
+ publishes <command>UPDATE</command> and/or <command>DELETE</command> operations must
+ contain only columns that are covered by <literal>REPLICA IDENTITY</literal>. The
+ row-filter <literal>WHERE</literal> clause for a table added to a publication that
+ publishes <command>INSERT</command> can use any column. The <literal>WHERE</literal>
+ clause does not allow user-defined functions, operators, or types, nor
+ built-in functions that are not immutable.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -247,6 +261,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -260,6 +279,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
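As a quick sanity check of the documented examples above, the intended usage looks like this (table and column names are illustrative, not part of the patch):

```sql
-- Publish only a slice of the table; parentheses around the
-- expression are required by the grammar.
CREATE PUBLICATION pub_mid FOR TABLE departments
    WHERE (id > 2000 AND id <= 3000);

-- For publications that publish UPDATE/DELETE, the filter may only
-- reference REPLICA IDENTITY columns (here: the primary key).
CREATE PUBLICATION pub_upd FOR TABLE departments WHERE (id < 500);
```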
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..51f4a26 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -208,6 +208,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
that are being subscribed to when the replication starts.
The default is <literal>true</literal>.
</para>
+ <para>
+ If any of the publications contain a <literal>WHERE</literal> clause, it
+ will affect what data is copied. Refer to
+ <xref linkend="sql-createsubscription-notes" /> for details.
+ </para>
</listitem>
</varlistentry>
@@ -293,7 +298,7 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</variablelist>
</refsect1>
- <refsect1>
+ <refsect1 id="sql-createsubscription-notes" xreflabel="Notes">
<title>Notes</title>
<para>
@@ -319,6 +324,19 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
the parameter <literal>create_slot = false</literal>. This is an
implementation restriction that might be lifted in a future release.
</para>
+
+ <para>
+ If any table in the publication has a <literal>WHERE</literal> clause, rows
+ that do not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be replicated. If the subscription has several publications in
+ which the same table has been published with different filters, those
+ expressions are OR'ed together, so that rows satisfying any of the
+ expressions will be replicated. Note that if one of those publications
+ has no filter at all, the other filters become redundant. If the
+ subscriber is a <productname>PostgreSQL</productname> version before 15,
+ any row filtering is ignored during data synchronization.
+ </para>
+
</refsect1>
<refsect1>
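The OR-ing behavior described in the Notes above can be illustrated as follows (publication, table, and connection names are made up for the example):

```sql
CREATE PUBLICATION pub_low  FOR TABLE t WHERE (id < 100);
CREATE PUBLICATION pub_high FOR TABLE t WHERE (id > 900);

-- A subscription to both publications receives rows satisfying
-- (id < 100) OR (id > 900); if either publication had no filter,
-- all rows of t would be replicated.
CREATE SUBSCRIPTION s CONNECTION 'dbname=src'
    PUBLICATION pub_low, pub_high;
```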
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 65db07f..dff1625 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -283,22 +286,51 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
return result;
}
+Node *
+GetTransformedWhereClause(ParseState *pstate, PublicationRelInfo *pri,
+ bool bfixupcollation)
+{
+ ParseNamespaceItem *nsitem;
+ Node *transformedwhereclause = NULL;
+
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, pri->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ transformedwhereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION WHERE");
+
+ /* Fix up collation information */
+ if (bfixupcollation)
+ assign_expr_collations(pstate, transformedwhereclause);
+
+ return transformedwhereclause;
+}
+
/*
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -319,10 +351,19 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a ParseState to parse with */
+ pstate = make_parsestate(NULL);
+
+ /* Transform the WHERE clause, fixing up collation information */
+ whereclause = GetTransformedWhereClause(pstate, pri, true);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -336,6 +377,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -352,6 +399,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 404bb5d..f997867 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -497,6 +497,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *rels = NIL;
Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
Oid pubid = pubform->oid;
+ Node *oldrelwhereclause = NULL;
/*
* It is quite possible that for the SET case user has not specified any
@@ -529,40 +530,92 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *delrels = NIL;
ListCell *oldlc;
+ /*
+ * Check if the relation is member of the existing schema in the
+ * publication or member of the schema list specified.
+ */
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
- /* Calculate which relations to drop. */
+ /*
+ * In order to recreate the relation list for the publication,
+ * look for existing relations that need not be dropped.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
+ ListCell *newlc;
+ PublicationRelInfo *oldrel;
bool found = false;
+ HeapTuple rftuple;
+ bool rfisnull = true;
+
+ /* Look up the syscache for the old publication-relation mapping */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(oldrelid),
+ ObjectIdGetDatum(pubid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ Datum whereClauseDatum;
+
+ whereClauseDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual,
+ &rfisnull);
+ if (!rfisnull)
+ oldrelwhereclause = stringToNode(TextDatumGetCString(whereClauseDatum));
+
+ ReleaseSysCache(rftuple);
+ }
foreach(newlc, rels)
{
PublicationRelInfo *newpubrel;
newpubrel = (PublicationRelInfo *) lfirst(newlc);
+
+ /*
+ * Check whether any of the new relations match the existing
+ * relations in the publication. Additionally, if the relation
+ * has an associated WHERE clause, check that the WHERE clauses
+ * also match. The rest will be dropped.
+ */
if (RelationGetRelid(newpubrel->relation) == oldrelid)
{
- found = true;
- break;
+ if (rfisnull && !newpubrel->whereClause)
+ {
+ found = true;
+ break;
+ }
+
+ if (!rfisnull && newpubrel->whereClause)
+ {
+ ParseState *pstate = make_parsestate(NULL);
+ Node *whereclause;
+
+ whereclause = GetTransformedWhereClause(pstate,
+ newpubrel,
+ false);
+ if (equal(oldrelwhereclause, whereclause))
+ {
+ free_parsestate(pstate);
+ found = true;
+ break;
+ }
+
+ free_parsestate(pstate);
+ }
}
}
- /* Not yet in the list, open it and add to the list */
+
+ /*
+ * Add the non-matched relations to a list so that they can
+ * be dropped.
+ */
if (!found)
{
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrelid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
}
@@ -899,6 +952,7 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ List *relids_with_rf = NIL;
/*
* Open, share-lock, and check all the explicitly-specified relations
@@ -926,15 +980,26 @@ OpenTableList(List *tables)
*/
if (list_member_oid(relids, myrelid))
{
+ /* Disallow duplicate tables if there are any with row-filters. */
+ if (t->whereClause || list_member_oid(relids_with_rf, myrelid))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("conflicting or redundant row-filters for \"%s\"",
+ RelationGetRelationName(rel))));
+
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
+ if (t->whereClause)
+ relids_with_rf = lappend_oid(relids_with_rf, myrelid);
+
/*
* Add children of this rel, if requested, so that they too are added
* to the publication. A partitioned table can't have any inheritance
@@ -967,6 +1032,8 @@ OpenTableList(List *tables)
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -974,6 +1041,7 @@ OpenTableList(List *tables)
}
list_free(relids);
+ list_free(relids_with_rf);
return rels;
}
@@ -993,6 +1061,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -1088,6 +1158,11 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
RelationGetRelationName(rel))));
}
+ if (pubrel->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE")));
+
ObjectAddressSet(obj, PublicationRelRelationId, prid);
performDeletion(&obj, DROP_CASCADE, 0);
}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 297b6ee..be9c1fb 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4832,6 +4832,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index f537d3e..5776447 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2312,6 +2312,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 86ce33b..c9ccbf3
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9654,12 +9654,13 @@ CreatePublicationStmt:
* relation_expr here.
*/
PublicationObjSpec:
- TABLE relation_expr
+ TABLE relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_TABLE;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $2;
+ $$->pubtable->whereClause = $3;
}
| ALL TABLES IN_P SCHEMA ColId
{
@@ -9674,28 +9675,45 @@ PublicationObjSpec:
$$->pubobjtype = PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA;
$$->location = @5;
}
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ /*
+ * The OptWhereClause must be stored here but it is
+ * valid only for tables. If the ColId turns out not to be
+ * a table, this will be detected later in
+ * preprocess_pubobj_list() and an error will be thrown.
+ */
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
$$->location = @1;
}
- | ColId indirection
+ | ColId indirection OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
+ $$->pubtable->whereClause = $3;
$$->location = @1;
}
/* grammar like tablename * , ONLY tablename, ONLY ( tablename ) */
- | extended_relation_expr
+ | extended_relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $1;
+ $$->pubtable->whereClause = $2;
}
| CURRENT_SCHEMA
{
@@ -17343,7 +17361,8 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ if (pubobj->name)
{
/* convert it to PublicationTable */
PublicationTable *pubtable = makeNode(PublicationTable);
@@ -17356,6 +17375,13 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
else if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_SCHEMA ||
pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA)
{
+ /* WHERE clause is not allowed on a schema object */
+ if (pubobj->pubtable && pubobj->pubtable->whereClause)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("WHERE clause for schema not allowed"),
+ parser_errposition(pubobj->location));
+
/*
* We can distinguish between the different type of schema
* objects based on whether name and pubtable is set.
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 542f916..29bebb7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index c5c3f26..f66243e 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -3538,11 +3538,20 @@ errorMissingRTE(ParseState *pstate, RangeVar *relation)
rte->eref->aliasname)),
parser_errposition(pstate, relation->location)));
else
+ {
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_TABLE),
+ errmsg("invalid reference to table \"%s\" in publication WHERE expression",
+ relation->relname),
+ parser_errposition(pstate, relation->location)));
+
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_TABLE),
errmsg("missing FROM-clause entry for table \"%s\"",
relation->relname),
parser_errposition(pstate, relation->location)));
+ }
}
/*
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..e7905ed 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the
+ * relation qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,80 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get relation qual. DISTINCT avoids the same expression of a table in
+ * multiple publications from being included multiple times in the final
+ * expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ /*
+ * Multiple row-filter expressions for the same table will later be
+ * combined by the COPY using OR, but this means that if any of the
+ * filters is null, none of the other filters is meaningful. So this
+ * loop also checks for null filters and exits early if any are
+ * encountered.
+ */
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+
+ if (isnull)
+ {
+ /*
+ * A single null filter nullifies the effect of any other filter for this
+ * table.
+ */
+ if (*qual)
+ {
+ list_free_deep(*qual);
+ *qual = NIL;
+ }
+ break;
+ }
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +887,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +896,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +907,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with OR'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +927,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of OR'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
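For reference, the COPY statement that copy_table() ends up building for a filtered table has this shape (schema, column, and filter names are illustrative):

```sql
-- Distinct filters fetched from the publisher are OR'ed into a single
-- WHERE clause of the initial-sync query.
COPY (SELECT id, name
      FROM public.departments
      WHERE (id > 2000 AND id <= 3000) OR (active IS TRUE)) TO STDOUT;
```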
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6f6a203..3b85915 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -15,16 +15,24 @@
#include "access/tupconvert.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/makefuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -116,6 +124,17 @@ typedef struct RelationSyncEntry
PublicationActions pubactions;
/*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' indicates if the exprstate has been assigned
+ * yet or not. We cannot just use the exprstate value for this purpose
+ * because there might be no filter at all for the current relid (e.g.
+ * exprstate is NULL).
+ */
+ bool rowfilter_valid;
+ ExprState *exprstate; /* ExprState for row filter(s) */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+
+ /*
* OID of the relation to publish changes as. For a partition, this may
* be set to one of its ancestors whose schema will be used when
* replicating changes, if publish_via_partition_root is set for the
@@ -137,7 +156,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +165,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +647,265 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+
+ /* Cache ExprState using CacheMemoryContext. */
+ Assert(CurrentMemoryContext == CacheMemoryContext);
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be cast to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * This is the same code as ExecPrepareExpr() but that is not used because
+ * we have no EState to pass it. There should probably be another function
+ * in the executor to handle the execution outside a normal Plan tree
+ * context.
+ */
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false i.e. the change
+ * isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ List *rfnodes = NIL;
+ int n_filters;
+
+ /*
+ * If the row filter caching is currently flagged "invalid", we do not
+ * yet know whether there are any row filters for this relation.
+ *
+ * This code is usually one-time execution.
+ *
+ * NOTE: The ExprState cache could have been created up-front in the
+ * function get_rel_sync_entry() instead of the deferred on-the-fly
+ * assignment below. The reason for choosing to do it here is because there
+ * are some scenarios where the get_rel_sync_entry() is called but where a
+ * row will not be published. For example, for truncate, we may not need
+ * any row evaluation, so there is no need to compute it. It would also be
+ * a waste if any error happens before actually evaluating the filter. In
+ * the future there could be other operations that use get_rel_sync_entry
+ * but do not need to build an ExprState. Furthermore, because the
+ * decision to publish or not is made AFTER the call to get_rel_sync_entry
+ * it may be that the filter evaluation is not necessary at all. So the
+ * decision was to defer this logic to last moment when we know it will be
+ * needed.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ MemoryContext oldctx;
+
+ /* Release the tuple table slot if it already exists. */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Check whether there are any row filters for this relation. If there
+ * are, prepare the necessary ExprState and cache it in entry->exprstate.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use the
+ * row filter of the topmost partitioned table instead of the row filter
+ * of its own partition.
+ *
+ * NOTE: Multiple row filters for the same table are OR'ed together, so
+ * if any of the publications has *no* filter, the other filters
+ * effectively have no meaning either (every row qualifies).
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+
+ /*
+ * Look up whether there is a row filter and, if so, remember it in a
+ * list. All the filters are combined after this 'publications' loop.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(entry->publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes = lappend(rfnodes, rfnode);
+ MemoryContextSwitchTo(oldctx);
+
+ ReleaseSysCache(rftuple);
+ }
+ else
+ {
+ /*
+ * If there is no row filter, then any other row filters for this
+ * table have no effect either (filters are OR'ed together), so we can
+ * discard anything found so far and exit the publications loop early.
+ */
+ if (rfnodes)
+ {
+ list_free_deep(rfnodes);
+ rfnodes = NIL;
+ }
+ ReleaseSysCache(rftuple);
+ break;
+ }
+
+ }
+
+ } /* loop all subscribed publications */
+
+ /*
+ * Combine all the row filters (if any) into a single filter, and then
+ * build the ExprState for it.
+ */
+ n_filters = list_length(rfnodes);
+ if (n_filters > 0)
+ {
+ Node *rfnode;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes, -1) : linitial(rfnodes);
+ entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains.
+ */
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ entry->rowfilter_valid = true;
+ }
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ if (message_level_is_interesting(DEBUG3))
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * Evaluate the row filter. Multiple publication row filters have
+ * already been combined into a single exprstate (checked non-NULL
+ * above).
+ */
+ result = pgoutput_row_filter_exec_expr(entry->exprstate, ecxt);
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
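The caching logic above combines the per-publication filters with OR, and treats a publication with *no* filter as matching every row. A short Python sketch of that combination rule (illustrative only; `combine_filters` is an invented name, not part of the patch) under those two stated assumptions:

```python
def combine_filters(filters):
    """filters: list of predicates; a None entry means 'no filter'."""
    if not filters or any(f is None for f in filters):
        return lambda row: True  # a missing filter means publish every row
    return lambda row: any(f(row) for f in filters)  # OR the filters

pub1 = lambda row: row["c"] % 2 == 0   # like a (c % 2 = 0) row filter
pub2 = lambda row: row["c"] % 3 == 0   # like a (c % 3 = 0) row filter

combined = combine_filters([pub1, pub2])
published = [c for c in range(1, 21) if combined({"c": c})]
```

Note how a single unfiltered publication short-circuits everything, which is exactly why the C code discards the list and exits the publications loop early in that case.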
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +932,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +956,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +963,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * The schema must be sent before switching the relation (when
+ * publishing via root) because maybe_send_schema() also sends the
+ * ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +996,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +1030,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +1099,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1116,9 +1421,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1139,8 +1445,11 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1245,9 +1554,6 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
}
list_free(pubids);
@@ -1354,6 +1660,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanup (it will be rebuilt later if needed).
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstate != NULL)
+ {
+ pfree(entry->exprstate);
+ entry->exprstate = NULL;
+ }
}
}
@@ -1365,6 +1686,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1696,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1392,6 +1716,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index ea721d9..fb5cfc5 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3150,17 +3150,21 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
" JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
"WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
"UNION\n"
"SELECT pubname\n"
+ " , pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+ " JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
"WHERE pr.prrelid = '%s'\n"
"UNION\n"
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
"WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
"ORDER BY 1;",
@@ -3196,6 +3200,13 @@ describeOneTableDetails(const char *schemaname,
printfPQExpBuffer(&buf, " \"%s\"",
PQgetvalue(result, i, 0));
+ /* row filter (if any) */
+ if (pset.sversion >= 150000)
+ {
+ if (!PQgetisnull(result, i, 1))
+ appendPQExpBuffer(&buf, " WHERE %s", PQgetvalue(result, i, 1));
+ }
+
printTableAddFooter(&cont, buf.data);
}
PQclear(result);
@@ -6320,8 +6331,12 @@ addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
for (i = 0; i < count; i++)
{
if (!singlecol)
+ {
printfPQExpBuffer(buf, " \"%s.%s\"", PQgetvalue(res, i, 0),
PQgetvalue(res, i, 1));
+ if (!PQgetisnull(res, i, 2))
+ appendPQExpBuffer(buf, " WHERE %s", PQgetvalue(res, i, 2));
+ }
else
printfPQExpBuffer(buf, " \"%s\"", PQgetvalue(res, i, 0));
@@ -6450,8 +6465,15 @@ describePublications(const char *pattern)
{
/* Get the tables for the specified publication */
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 902f2f2..96c55f6 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -20,6 +20,7 @@
#include "catalog/genbki.h"
#include "catalog/objectaddress.h"
#include "catalog/pg_publication_d.h"
+#include "parser/parse_node.h"
/* ----------------
* pg_publication definition. cpp turns this into
@@ -86,6 +87,7 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -123,13 +125,16 @@ extern List *GetPubPartitionOptionRelations(List *result,
extern bool is_publishable_relation(Relation rel);
extern bool is_schema_publication(Oid pubid);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
extern char *get_publication_name(Oid pubid, bool missing_ok);
+extern Node *GetTransformedWhereClause(ParseState *pstate,
+ PublicationRelInfo *pri,
+ bool bfixupcollation);
#endif /* PG_PUBLICATION_H */
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 067138e..5d58a9f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3641,6 +3641,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index c096fbd..1e78a04 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -239,6 +239,152 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
+ "public.testpub_rf_tbl3" WHERE ((e > 1000) AND (e < 2000))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE ((e > 1000) AND (e < 2000))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
+
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dplus_rf_yes FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dplus_rf_no FOR TABLE testpub_rf_tbl1;
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+ Table "public.testpub_rf_tbl1"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+----------+--------------+-------------
+ a | integer | | | | plain | |
+ b | text | | | | extended | |
+Publications:
+ "testpub_dplus_rf_no"
+ "testpub_dplus_rf_yes" WHERE (a > 1)
+
+DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+ Publication testpub_syntax1
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (e < 999)
+
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+ Publication testpub_syntax2
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "testpub_rf_myschema.testpub_rf_tbl5" WHERE (h < 999)
+
+DROP PUBLICATION testpub_syntax2;
+-- fail - a WHERE row filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a =...
+ ^
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+ERROR: WHERE clause for schema not allowed
+LINE 1: ...ax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf...
+ ^
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+ERROR: cannot add relation "testpub_rf_myschema1.testpub_rf_tb16" to publication
+DETAIL: Table's schema "testpub_rf_myschema1" is already part of the publication or part of the specified schema list.
+RESET client_min_messages;
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 0662882..9bcd7d2 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -134,6 +134,80 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dplus_rf_yes FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dplus_rf_no FOR TABLE testpub_rf_tbl1;
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;
+-- fail - a WHERE row filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+RESET client_min_messages;
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
new file mode 100644
index 0000000..64e71d0
--- /dev/null
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -0,0 +1,357 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 10;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4a FOR TABLE tab_rowfilter_4 WHERE (c % 2 = 0)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4b FOR TABLE tab_rowfilter_4"
+);
+
+#
+# The following INSERTs are executed before CREATE SUBSCRIPTION, so they
+# test the initial data copy (table synchronization) path.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) SELECT generate_series(1, 10)");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4a, tap_pub_4b"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if the
+# c value is divisible by 2 OR 3 (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(13|2|20), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but when one publication has no filter it
+# just means OR everything, i.e. the same as no filter at all.
+# Expect all rows: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(10|1|10), 'check initial data copy from table tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) VALUES (21), (22), (23), (24), (25)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) VALUES (0), (11), (12)");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if the
+# c value is divisible by 2 OR 3.
+#
+# Expect original rows (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+# Plus (21, 22, 24)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(16|2|24), 'check replicated rows to tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but when one publication has no filter it
+# just means OR everything, i.e. the same as no filter at all.
+# Expect all rows from initial copy: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+# And also (0, 11, 12)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, because b = NULL and NULL <> 'filtered' yields NULL, which is not true
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, because the row filter references column b, which is not
+# part of the PK or REPLICA IDENTITY, so the old tuple has b = NULL and the
+# row filter evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After the TRUNCATE and REFRESH PUBLICATION, the initial data copy will
+# apply the root partitioned table's row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
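The OR'ed filter semantics exercised by the tab_rowfilter_2 test above can be
modeled outside the server. This is a hypothetical illustration, not
PostgreSQL code; the function names are made up for the sketch:

```python
# Hypothetical model (not PostgreSQL code) of how row filters from
# multiple publications combine: a row is replicated if ANY filter
# matches, because the filter expressions are OR'ed together.

def filter_pub_1(c):      # tap_pub_1 WHERE (c % 2 = 0)
    return c % 2 == 0

def filter_pub_2(c):      # tap_pub_2 WHERE (c % 3 = 0)
    return c % 3 == 0

# INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)
rows = range(1, 21)
replicated = [c for c in rows if filter_pub_1(c) or filter_pub_2(c)]

# Matches the TAP test expectation of count|min|max = 13|2|20
assert (len(replicated), min(replicated), max(replicated)) == (13, 2, 20)
```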
On Thu, Sep 23, 2021 at 10:33 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
Hi,
I finally had time to take a closer look at the patch again, so here's
some review comments. The thread is moving fast, so chances are some of
the comments are obsolete or were already raised in the past.
...
11) extra (unnecessary) parens in the deparsed expression
test=# alter publication p add table t where ((b < 100) and (c < 100));
ALTER PUBLICATION
test=# \dRp+ p
Publication p
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
-------+------------+---------+---------+---------+-----------+----------
user | f | t | t | t | t | f
Tables:
"public.t" WHERE (((b < 100) AND (c < 100)))
Euler's fix for this was integrated into v45 [1]
------
[1]: /messages/by-id/CAFPTHDYB4nbxCMAFQGowJtDf7E6uBc==_HupBKy7MaMhM+9QQA@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Wed, Dec 8, 2021 7:52 PM Ajin Cherian <itsajin@gmail.com> wrote:
On Tue, Dec 7, 2021 at 5:36 PM Peter Smith <smithpb2250@gmail.com> wrote:
We were mid-way putting together the next v45* when your latest
attachment was posted over the weekend. So we will proceed with our
original plan to post our v45* (tomorrow). After v45* is posted we will pause
to find all the differences between your unified patch and our v45* patch set. Our
intention is to integrate as many improvements as possible from your
changes into the v46* etc that will follow tomorrow’s v45*. On some
points, we will most likely need further discussion.

Posting an update for review comments, incorporating contributions mainly from
Peter Smith.
I've also included changes based on Euler's combined patch, especially changes
to documentation and test cases.
I have left out Hou-san's 0005 in this patch set; Hou-san will provide a rebased
update based on it.
Attached the top-up patch (as 0006), which does the replica identity validation
when the actual UPDATE/DELETE happens. I adjusted the patch name to make the
change clearer.
The new version of the top-up patch addresses all comments from Peter [1] and Greg [2].
I also fixed a validation issue in the top-up patch reported by Tang. The fix
is: if we add a partitioned table with a filter and pubviaroot is true, we need
to validate the parent table's row filter when UPDATEing the child table, and
we should convert the parent table's columns to the child's during validation
in case the column order of the parent table differs from the child table's.
[1]: /messages/by-id/CAHut+PuBdXGLw1+CBoNxXUp3bHcHcKYWHx1RSGF6tY5aSLu5ZA@mail.gmail.com
[2]: /messages/by-id/CAJcOf-dgxGmRs54nxQSZWDc0gaHZWFf3n+BhOChNXhi_cb8g9A@mail.gmail.com
Best regards,
Hou zj
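The column-conversion point above can be sketched with a toy model. This is an
illustration with assumed names (remap_attnums is invented for the sketch, not
a function from the patch):

```python
# Illustrative sketch of why a parent table's row-filter columns must be
# remapped before validating an UPDATE on a child partition whose column
# order differs from the parent's. All names here are hypothetical.

parent_cols = ['a', 'b']        # parent: (a int, b text)
child_cols = ['b', 'a']         # partition attached with the columns swapped

def remap_attnums(filter_attnums, parent_cols, child_cols):
    """Translate parent attribute numbers (1-based) to the child's by name."""
    return [child_cols.index(parent_cols[n - 1]) + 1 for n in filter_attnums]

# The parent filter references column "a" (parent attnum 1); in the child
# the same column is attnum 2, so validating with the parent numbering
# would inspect the wrong column.
assert remap_attnums([1], parent_cols, child_cols) == [2]
```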
Attachments:
v45-0005-Cache-ExprState-per-pubaction.patchapplication/octet-stream; name=v45-0005-Cache-ExprState-per-pubaction.patchDownload
From 8aff7339a51395bcfb3e8f52d9e4f94a648ed915 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 8 Dec 2021 05:37:36 -0500
Subject: [PATCH v45 5/5] Cache ExprState per pubaction.
If a subscriber has multiple publications and these publications include the
same table then there can be multiple filters that apply to that table.
These filters are stored per pubaction of the publications. There are 4 kinds
of pubaction ("insert", "update", "delete", "truncate"), but row-filters are
not applied for "truncate".
Filters for the same pubaction are all combined (OR'ed) and cached as one, so
at the end there are at most 3 cached filters per table.
The appropriate (pubaction) filter is executed according to the DML operation.
Author: Peter Smith
Discussion: https://www.postgresql.org/message-id/CAA4eK1%2BhVXfOSScbf5LUB%3D5is%3DwYaC6NBhLxuvetbWQnZRnsVQ%40mail.gmail.com
---
src/backend/replication/pgoutput/pgoutput.c | 176 ++++++++++++++++++----------
1 file changed, 117 insertions(+), 59 deletions(-)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 0ccffa7..c244961 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -130,10 +130,16 @@ typedef struct RelationSyncEntry
* The flag 'rowfilter_valid' indicates if the exprstate has been assigned
* yet or not. We cannot just use the exprstate value for this purpose
* because there might be no filter at all for the current relid (e.g.
- * exprstate is NULL).
+ * every exprstate is NULL).
+ * The row-filter exprstate is stored per pubaction type (row-filters are
+ * not applied for "truncate" pubaction).
*/
bool rowfilter_valid;
- ExprState *exprstate; /* ExprState for row filter(s) */
+#define IDX_PUBACTION_INSERT 0
+#define IDX_PUBACTION_UPDATE 1
+#define IDX_PUBACTION_DELETE 2
+#define IDX_PUBACTION_n 3
+ ExprState *exprstate[IDX_PUBACTION_n]; /* ExprState for row filter(s). One per pubaction. */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
@@ -175,10 +181,10 @@ static EState *create_estate_for_relation(Relation rel);
static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter(int idx_pubaction, Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, TupleTableSlot *slot,
RelationSyncEntry *entry);
-static bool pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple,
+static bool pgoutput_row_filter_update_check(int idx_pubaction, Relation relation, HeapTuple oldtuple,
HeapTuple newtuple, RelationSyncEntry *entry,
ReorderBufferChangeType *action);
@@ -755,7 +761,7 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
* If the change is to be replicated returns true, else false.
*/
static bool
-pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
+pgoutput_row_filter_update_check(int idx_pubaction, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
{
TupleDesc desc = RelationGetDescr(relation);
int i;
@@ -763,7 +769,7 @@ pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTupl
TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
/* Bail out if there is no row filter */
- if (!entry->exprstate)
+ if (!entry->exprstate[idx_pubaction])
return true;
/* update requires a new tuple */
@@ -780,7 +786,7 @@ pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTupl
if (!oldtuple)
{
*action = REORDER_BUFFER_CHANGE_UPDATE;
- return pgoutput_row_filter(relation, NULL, newtuple, NULL, entry);
+ return pgoutput_row_filter(idx_pubaction, relation, NULL, newtuple, NULL, entry);
}
old_slot = entry->old_tuple;
@@ -827,8 +833,8 @@ pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTupl
}
- old_matched = pgoutput_row_filter(relation, NULL, NULL, old_slot, entry);
- new_matched = pgoutput_row_filter(relation, NULL, NULL, tmp_new_slot, entry);
+ old_matched = pgoutput_row_filter(idx_pubaction, relation, NULL, NULL, old_slot, entry);
+ new_matched = pgoutput_row_filter(idx_pubaction, relation, NULL, NULL, tmp_new_slot, entry);
if (!old_matched && !new_matched)
return false;
@@ -850,8 +856,8 @@ static void
pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
{
ListCell *lc;
- List *rfnodes = NIL;
- int n_filters;
+ List *rfnodes[] = {NIL, NIL, NIL}; /* One per pubaction */
+ bool no_filter[] = {false, false, false}; /* One per pubaction */
/*
* If the row filter caching is currently flagged "invalid" then it means we
@@ -907,7 +913,7 @@ pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntr
bool rfisnull;
/*
- * Lookup if there is a row-filter, and if yes remember it in a list.
+ * Lookup if there is a row-filter, and if yes remember it in a list (per pubaction).
* In code following this 'publications' loop we will combine all filters.
*/
rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(entry->publish_as_relid), ObjectIdGetDatum(pub->oid));
@@ -920,56 +926,101 @@ pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntr
Node *rfnode;
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = stringToNode(TextDatumGetCString(rfdatum));
- rfnodes = lappend(rfnodes, rfnode);
+ /* Gather the rfnodes per pubaction of this publication. */
+ if (pub->pubactions.pubinsert)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[IDX_PUBACTION_INSERT] = lappend(rfnodes[IDX_PUBACTION_INSERT], rfnode);
+ }
+ if (pub->pubactions.pubupdate)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[IDX_PUBACTION_UPDATE] = lappend(rfnodes[IDX_PUBACTION_UPDATE], rfnode);
+ }
+ if (pub->pubactions.pubdelete)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[IDX_PUBACTION_DELETE] = lappend(rfnodes[IDX_PUBACTION_DELETE], rfnode);
+ }
MemoryContextSwitchTo(oldctx);
-
- ReleaseSysCache(rftuple);
}
else
{
- /*
- * If there is no row-filter, then any other row-filters for this table
- * also have no effect (because filters get OR-ed together) so we can
- * just discard anything found so far and exit early from the publications
- * loop.
- */
- if (rfnodes)
- {
- list_free_deep(rfnodes);
- rfnodes = NIL;
- }
- ReleaseSysCache(rftuple);
- break;
+ /* Remember which pubactions have no row-filter. */
+ if (pub->pubactions.pubinsert)
+ no_filter[IDX_PUBACTION_INSERT] = true;
+ if (pub->pubactions.pubupdate)
+ no_filter[IDX_PUBACTION_UPDATE] = true;
+ if (pub->pubactions.pubdelete)
+ no_filter[IDX_PUBACTION_DELETE] = true;
}
+ ReleaseSysCache(rftuple);
}
} /* loop all subscribed publications */
/*
- * Combine using all the row-filters (if any) into a single filter, and then build the ExprState for it
+ * Now all the filters for all pubactions are known, let's try to combine them
+ * when their pubactions are the same.
*/
- n_filters = list_length(rfnodes);
- if (n_filters > 0)
{
- Node *rfnode;
- TupleDesc tupdesc = RelationGetDescr(relation);
+ int idx;
+ bool found_filters = false;
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes, -1) : linitial(rfnodes);
- entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
+ /* For each pubaction... */
+ for (idx = 0; idx < IDX_PUBACTION_n; idx++)
+ {
+ int n_filters;
- /*
- * Create tuple table slots for row filter. Create a copy of the
- * TupleDesc as it needs to live as long as the cache remains.
- */
- tupdesc = CreateTupleDescCopy(tupdesc);
- entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
- entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
- entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
- entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
- MemoryContextSwitchTo(oldctx);
+ /*
+ * If one or more publications with this pubaction had no filter at all,
+ * then that nullifies the effect of all other filters for the same
+ * pubaction (because filters get OR'ed together).
+ */
+ if (no_filter[idx])
+ {
+ if (rfnodes[idx])
+ {
+ list_free_deep(rfnodes[idx]);
+ rfnodes[idx] = NIL;
+ }
+ }
+
+ /*
+ * If there were one or more filters for this pubaction then combine them
+ * (if necessary) and cache the ExprState.
+ */
+ n_filters = list_length(rfnodes[idx]);
+ if (n_filters > 0)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes[idx], -1) : linitial(rfnodes[idx]);
+ entry->exprstate[idx] = pgoutput_row_filter_init_expr(rfnode);
+ MemoryContextSwitchTo(oldctx);
+
+ found_filters = true; /* flag that we will need slots made */
+ }
+ } /* for each pubaction */
+
+ if (found_filters)
+ {
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create tuple table slots for row filter. Create a copy of the
+ * TupleDesc as it needs to live as long as the cache remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ MemoryContextSwitchTo(oldctx);
+ }
}
entry->rowfilter_valid = true;
@@ -982,7 +1033,7 @@ pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntr
* If it returns true, the change is replicated, otherwise, it is not.
*/
static bool
-pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *slot,
+pgoutput_row_filter(int idx_pubaction, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *slot,
RelationSyncEntry *entry)
{
EState *estate;
@@ -991,7 +1042,7 @@ RelationSyncEntry *entry)
Oid relid = RelationGetRelid(relation);
/* Bail out if there is no row filter */
- if (!entry->exprstate)
+ if (!entry->exprstate[idx_pubaction])
return true;
if (message_level_is_interesting(DEBUG3))
@@ -1016,12 +1067,12 @@ RelationSyncEntry *entry)
/*
* NOTE: Multiple publication row-filters have already been combined to a
- * single exprstate.
+ * single exprstate (for this pubaction).
*/
- if (entry->exprstate)
+ if (entry->exprstate[idx_pubaction])
{
/* Evaluates row filter */
- result = pgoutput_row_filter_exec_expr(entry->exprstate, ecxt);
+ result = pgoutput_row_filter_exec_expr(entry->exprstate[idx_pubaction], ecxt);
}
/* Cleanup allocated resources */
@@ -1093,7 +1144,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, NULL, tuple, NULL, relentry))
+ if (!pgoutput_row_filter(IDX_PUBACTION_INSERT, relation, NULL, tuple, NULL, relentry))
break;
/*
@@ -1126,7 +1177,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
- if (!pgoutput_row_filter_update_check(relation, oldtuple, newtuple, relentry,
+ if (!pgoutput_row_filter_update_check(IDX_PUBACTION_UPDATE, relation, oldtuple, newtuple, relentry,
&modified_action))
break;
@@ -1180,7 +1231,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(relation, oldtuple, NULL, NULL, relentry))
+ if (!pgoutput_row_filter(IDX_PUBACTION_DELETE, relation, oldtuple, NULL, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1601,7 +1652,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->new_tuple = NULL;
entry->old_tuple = NULL;
entry->tmp_new_tuple = NULL;
- entry->exprstate = NULL;
+ entry->exprstate[IDX_PUBACTION_INSERT] = NULL;
+ entry->exprstate[IDX_PUBACTION_UPDATE] = NULL;
+ entry->exprstate[IDX_PUBACTION_DELETE] = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1768,6 +1821,7 @@ static void
rel_sync_cache_relation_cb(Datum arg, Oid relid)
{
RelationSyncEntry *entry;
+ int idx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1822,10 +1876,14 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
ExecDropSingleTupleTableSlot(entry->scantuple);
entry->scantuple = NULL;
}
- if (entry->exprstate != NULL)
+ /* Cleanup the ExprState for each of the pubactions. */
+ for (idx = 0; idx < IDX_PUBACTION_n; idx++)
{
- pfree(entry->exprstate);
- entry->exprstate = NULL;
+ if (entry->exprstate[idx] != NULL)
+ {
+ pfree(entry->exprstate[idx]);
+ entry->exprstate[idx] = NULL;
+ }
}
}
}
--
1.8.3.1
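The per-pubaction caching scheme from the 0005 patch above can be modeled in a
few lines. This is an assumption-laden sketch, not the actual pgoutput code
(build_filter_cache is an invented name):

```python
# Illustrative sketch of the per-pubaction row-filter cache: filters are
# grouped by pubaction, OR'ed within each group, and a publication with
# no filter nullifies every other filter for that pubaction.

IDX_INSERT, IDX_UPDATE, IDX_DELETE = 0, 1, 2

def build_filter_cache(publications):
    """publications: list of (set of pubaction names, filter callable or None)."""
    rfnodes = [[], [], []]                 # collected filters, one list per pubaction
    no_filter = [False, False, False]      # pubactions with an unfiltered publication
    for actions, rfnode in publications:
        for idx, name in ((IDX_INSERT, 'insert'),
                          (IDX_UPDATE, 'update'),
                          (IDX_DELETE, 'delete')):
            if name in actions:
                if rfnode is None:
                    no_filter[idx] = True
                else:
                    rfnodes[idx].append(rfnode)
    cache = [None, None, None]             # at most 3 cached filters per table
    for idx in range(3):
        if no_filter[idx] or not rfnodes[idx]:
            continue                       # None means "replicate everything"
        exprs = rfnodes[idx]
        cache[idx] = (lambda row, exprs=exprs: any(f(row) for f in exprs))
    return cache

# A publication filtering all pubactions, plus an insert-only one without a filter:
cache = build_filter_cache([
    ({'insert', 'update', 'delete'}, lambda row: row % 2 == 0),
    ({'insert'}, None),
])
assert cache[IDX_INSERT] is None           # nullified by the unfiltered publication
assert cache[IDX_UPDATE](4) and not cache[IDX_UPDATE](5)
```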
v45-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v45-0001-Row-filter-for-logical-replication.patchDownload
From d41e29ac5bcae8e15c67a538b40b0ae5e03e6488 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 8 Dec 2021 03:55:06 -0500
Subject: [PATCH v45 1/5] Row filter for logical replication.
This feature adds row filtering for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The row-filter WHERE clause for a table added to a publication that publishes
UPDATE and/or DELETE operations must contain only columns that are covered by
REPLICA IDENTITY. The row-filter WHERE clause for a table added to a publication
that publishes INSERT can use any column. The WHERE clause does not allow
user-defined functions / operators / types; it also does not allow built-in
functions unless they are immutable.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is pulled by the subscriber. If the subscription has several
publications in which a table has been published with different WHERE clauses,
rows which satisfy ANY of the expressions will be copied. If the subscriber is a
pre-15 version, data synchronization won't use row filters if they are defined
in the publisher.
Previous versions cannot handle row filters.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
Psql commands \dRp+ and \d+ will display any row filters.
Author: Euler Taveira, Peter Smith
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
Combining multiple row-filters
==============================
The subscription is treated "as a union of all the publications" [1], so the
row-filters are combined with OR.
If the subscription has several publications in which the same table has been
published with different filters, those expressions get OR'ed together so that
rows satisfying any of the expressions will be replicated.
Notice this means if one of the publications has no filter at all then all other
filters become redundant.
Author: Peter Smith
[1] https://www.postgresql.org/message-id/574b4e78-2f35-acf3-4bdc-4b872582e739%40enterprisedb.com
Row-filter caching
==================
The cached row-filters (e.g. ExprState *) are invalidated only in the function
rel_sync_cache_relation_cb, which means that ALTER PUBLICATION for one table
will not cause the row-filters of other tables to also become invalidated.
The code related to caching row-filters is done just before they are needed
(in the pgoutput_row_filter function).
If there are multiple publication filters for a given table these are all
combined/flattened into a single filter.
Author: Peter Smith, Greg Nancarrow
The filter caching is based on suggestions from Amit [1] [2], and Houz [3]
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
[3] https://www.postgresql.org/message-id/OS0PR01MB5716090A70A73ADF58C58950948D9%40OS0PR01MB5716.jpnprd01.prod.outlook.com
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 13 +-
doc/src/sgml/ref/create_publication.sgml | 28 ++-
doc/src/sgml/ref/create_subscription.sgml | 20 +-
src/backend/catalog/pg_publication.c | 62 ++++-
src/backend/commands/publicationcmds.c | 105 ++++++--
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 38 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/parser/parse_relation.c | 9 +
src/backend/replication/logical/tablesync.c | 116 ++++++++-
src/backend/replication/pgoutput/pgoutput.c | 344 ++++++++++++++++++++++++++-
src/bin/psql/describe.c | 26 +-
src/include/catalog/pg_publication.h | 7 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 146 ++++++++++++
src/test/regress/sql/publication.sql | 74 ++++++
src/test/subscription/t/027_row_filter.pl | 357 ++++++++++++++++++++++++++++
24 files changed, 1354 insertions(+), 50 deletions(-)
mode change 100644 => 100755 src/backend/parser/gram.y
create mode 100644 src/test/subscription/t/027_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1d11be..af6b1f6 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6299,6 +6299,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
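As a quick sanity check of the new catalog column described above, the stored filter can be inspected with pg_get_expr() (a sketch; the publication and table names are whatever exists locally):

```sql
-- Show the row filter stored for each publication/relation pair;
-- prqual is NULL for tables published without a WHERE clause.
SELECT p.pubname,
       pr.prrelid::regclass AS tablename,
       pg_get_expr(pr.prqual, pr.prrelid) AS row_filter
  FROM pg_publication_rel pr
  JOIN pg_publication p ON p.oid = pr.prpubid;
```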
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index bb4ef5e..5d9869c 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -30,7 +30,7 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -52,7 +52,9 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
remove one or more tables/schemas from the publication. Note that adding
tables/schemas to a publication that is already subscribed to will require a
<literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on the
- subscribing side in order to become effective.
+ subscribing side in order to become effective. Note also that combining
+ <literal>DROP</literal> with a <literal>WHERE</literal> clause is not
+ allowed.
</para>
<para>
@@ -109,7 +111,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index d805e8e..d950316 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -28,7 +28,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -76,6 +76,10 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
This does not apply to a partitioned table, however. The partitions of
a partitioned table are always implicitly considered part of the
publication, so they are never explicitly added to the publication.
+ If the optional <literal>WHERE</literal> clause is specified, rows that do
+ not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published. Note that parentheses are required around the
+ expression. The <literal>WHERE</literal> clause has no effect on
+ <literal>TRUNCATE</literal> commands.
</para>
<para>
@@ -226,6 +230,16 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ The row-filter <literal>WHERE</literal> clause for a table added to a publication that
+ publishes <command>UPDATE</command> and/or <command>DELETE</command> operations must
+ contain only columns that are covered by <literal>REPLICA IDENTITY</literal>. The
+ row-filter <literal>WHERE</literal> clause for a table added to a publication that
+ publishes <command>INSERT</command> can use any column. The <literal>WHERE</literal>
+ clause does not allow user-defined functions / operators / types; it also does not allow
+ built-in functions unless they are immutable.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -247,6 +261,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -260,6 +279,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
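To make the REPLICA IDENTITY restriction above concrete, here is a sketch (the table and publication names are invented, and the second statement is expected to be rejected under this patch):

```sql
CREATE TABLE departments (id int PRIMARY KEY, active boolean);

-- OK: "id" is covered by the replica identity (the primary key), even
-- though the publication publishes UPDATE/DELETE by default.
CREATE PUBLICATION p_ok FOR TABLE departments WHERE (id > 2000);

-- Expected to fail: "active" is not part of the replica identity, but
-- the publication publishes UPDATE and DELETE.
CREATE PUBLICATION p_bad FOR TABLE departments WHERE (active IS TRUE);

-- Restricting the publication to INSERT lifts the column restriction.
CREATE PUBLICATION p_ins FOR TABLE departments
    WHERE (active IS TRUE) WITH (publish = 'insert');
```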
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..51f4a26 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -208,6 +208,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
that are being subscribed to when the replication starts.
The default is <literal>true</literal>.
</para>
+ <para>
+ If the publications contain conditional expressions, they will affect
+ what data is copied. Refer to the
+ <xref linkend="sql-createsubscription-notes" /> for details.
+ </para>
</listitem>
</varlistentry>
@@ -293,7 +298,7 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</variablelist>
</refsect1>
- <refsect1>
+ <refsect1 id="sql-createsubscription-notes" xreflabel="Notes">
<title>Notes</title>
<para>
@@ -319,6 +324,19 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
the parameter <literal>create_slot = false</literal>. This is an
implementation restriction that might be lifted in a future release.
</para>
+
+ <para>
+ If any table in the publication has a <literal>WHERE</literal> clause, rows
+ that do not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be replicated. If the subscription has several publications in
+ which the same table has been published with different filters, those
+ expressions get OR'ed together so that rows satisfying any of the expressions
+ will be replicated. Note that this means that if one of the publications has
+ no filter at all then all the other filters become redundant. If the subscriber is a
+ <productname>PostgreSQL</productname> version before 15 then any row filtering
+ is ignored during data synchronization.
+ </para>
+
</refsect1>
<refsect1>
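The OR'ing behaviour described in these notes can be illustrated as follows (all names hypothetical):

```sql
-- On the publisher: the same table published with two different filters.
CREATE PUBLICATION pub_big FOR TABLE t WHERE (id > 1000);
CREATE PUBLICATION pub_flagged FOR TABLE t WHERE (flag IS TRUE);

-- On the subscriber: rows satisfying either condition are replicated,
-- i.e. effectively WHERE (id > 1000) OR (flag IS TRUE).
CREATE SUBSCRIPTION s CONNECTION 'dbname=src host=pubhost'
    PUBLICATION pub_big, pub_flagged;
```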
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 65db07f..dff1625 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -283,22 +286,51 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
return result;
}
+Node *
+GetTransformedWhereClause(ParseState *pstate, PublicationRelInfo *pri,
+ bool bfixupcollation)
+{
+ ParseNamespaceItem *nsitem;
+ Node *transformedwhereclause = NULL;
+
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, pri->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ transformedwhereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION WHERE");
+
+ /* Fix up collation information */
+ if (bfixupcollation)
+ assign_expr_collations(pstate, transformedwhereclause);
+
+ return transformedwhereclause;
+}
+
/*
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -319,10 +351,19 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a ParseState to parse with */
+ pstate = make_parsestate(NULL);
+
+ /* Fix up collation information */
+ whereclause = GetTransformedWhereClause(pstate, pri, true);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -336,6 +377,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -352,6 +399,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 404bb5d..f997867 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -497,6 +497,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *rels = NIL;
Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
Oid pubid = pubform->oid;
+ Node *oldrelwhereclause = NULL;
/*
* It is quite possible that for the SET case user has not specified any
@@ -529,40 +530,92 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *delrels = NIL;
ListCell *oldlc;
+ /*
+ * Check if the relation is a member of an existing schema in the
+ * publication or a member of the specified schema list.
+ */
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
- /* Calculate which relations to drop. */
+ /*
+ * In order to recreate the relation list for the publication,
+ * look for existing relations that need not be dropped.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
+ ListCell *newlc;
+ PublicationRelInfo *oldrel;
bool found = false;
+ HeapTuple rftuple;
+ bool rfisnull = true;
+
+ /* look up the cache for the old relmap */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(oldrelid),
+ ObjectIdGetDatum(pubid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ Datum whereClauseDatum;
+
+ whereClauseDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual,
+ &rfisnull);
+ if (!rfisnull)
+ oldrelwhereclause = stringToNode(TextDatumGetCString(whereClauseDatum));
+
+ ReleaseSysCache(rftuple);
+ }
foreach(newlc, rels)
{
PublicationRelInfo *newpubrel;
newpubrel = (PublicationRelInfo *) lfirst(newlc);
+
+ /*
+ * Check if any of the new relations match the existing
+ * relations in the publication. Additionally, if the
+ * relation has an associated WHERE clause, check that the
+ * WHERE clauses also match. Drop the rest.
+ */
if (RelationGetRelid(newpubrel->relation) == oldrelid)
{
- found = true;
- break;
+ if (rfisnull && !newpubrel->whereClause)
+ {
+ found = true;
+ break;
+ }
+
+ if (!rfisnull && newpubrel->whereClause)
+ {
+ ParseState *pstate = make_parsestate(NULL);
+ Node *whereclause;
+
+ whereclause = GetTransformedWhereClause(pstate,
+ newpubrel,
+ false);
+ if (equal(oldrelwhereclause, whereclause))
+ {
+ free_parsestate(pstate);
+ found = true;
+ break;
+ }
+
+ free_parsestate(pstate);
+ }
}
}
- /* Not yet in the list, open it and add to the list */
+
+ /*
+ * Add the non-matched relations to a list so that they can
+ * be dropped.
+ */
if (!found)
{
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
-
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
-
- delrels = lappend(delrels, pubrel);
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrelid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
}
@@ -899,6 +952,7 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ List *relids_with_rf = NIL;
/*
* Open, share-lock, and check all the explicitly-specified relations
@@ -926,15 +980,26 @@ OpenTableList(List *tables)
*/
if (list_member_oid(relids, myrelid))
{
+ /* Disallow duplicate tables if there are any with row-filters. */
+ if (t->whereClause || list_member_oid(relids_with_rf, myrelid))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("conflicting or redundant row-filters for \"%s\"",
+ RelationGetRelationName(rel))));
+
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
+ if (t->whereClause)
+ relids_with_rf = lappend_oid(relids_with_rf, myrelid);
+
/*
* Add children of this rel, if requested, so that they too are added
* to the publication. A partitioned table can't have any inheritance
@@ -967,6 +1032,8 @@ OpenTableList(List *tables)
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -974,6 +1041,7 @@ OpenTableList(List *tables)
}
list_free(relids);
+ list_free(relids_with_rf);
return rels;
}
@@ -993,6 +1061,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -1088,6 +1158,11 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
RelationGetRelationName(rel))));
}
+ if (pubrel->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE")));
+
ObjectAddressSet(obj, PublicationRelRelationId, prid);
performDeletion(&obj, DROP_CASCADE, 0);
}
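For reference, the new error path in PublicationDropTables corresponds to statements like these (made-up names; the first statement should raise the new error):

```sql
-- Rejected: a row filter makes no sense when removing a table.
ALTER PUBLICATION p1 DROP TABLE t WHERE (id > 10);

-- The membership (including any stored filter) is dropped as a whole.
ALTER PUBLICATION p1 DROP TABLE t;
```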
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 297b6ee..be9c1fb 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4832,6 +4832,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index f537d3e..5776447 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2312,6 +2312,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
old mode 100644
new mode 100755
index 86ce33b..c9ccbf3
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9654,12 +9654,13 @@ CreatePublicationStmt:
* relation_expr here.
*/
PublicationObjSpec:
- TABLE relation_expr
+ TABLE relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_TABLE;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $2;
+ $$->pubtable->whereClause = $3;
}
| ALL TABLES IN_P SCHEMA ColId
{
@@ -9674,28 +9675,45 @@ PublicationObjSpec:
$$->pubobjtype = PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA;
$$->location = @5;
}
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ /*
+ * The OptWhereClause must be stored here, but it is
+ * valid only for tables. If the ColId is mistakenly
+ * not a table, this will be detected later in
+ * preprocess_pubobj_list() and an error will be thrown.
+ */
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
$$->location = @1;
}
- | ColId indirection
+ | ColId indirection OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
+ $$->pubtable->whereClause = $3;
$$->location = @1;
}
/* grammar like tablename * , ONLY tablename, ONLY ( tablename ) */
- | extended_relation_expr
+ | extended_relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $1;
+ $$->pubtable->whereClause = $2;
}
| CURRENT_SCHEMA
{
@@ -17343,7 +17361,8 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ if (pubobj->name)
{
/* convert it to PublicationTable */
PublicationTable *pubtable = makeNode(PublicationTable);
@@ -17356,6 +17375,13 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
else if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_SCHEMA ||
pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA)
{
+ /* WHERE clause is not allowed on a schema object */
+ if (pubobj->pubtable && pubobj->pubtable->whereClause)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("WHERE clause not allowed for schema"),
+ parser_errposition(pubobj->location));
+
/*
* We can distinguish between the different type of schema
* objects based on whether name and pubtable is set.
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 542f916..29bebb7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index c5c3f26..f66243e 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -3538,11 +3538,20 @@ errorMissingRTE(ParseState *pstate, RangeVar *relation)
rte->eref->aliasname)),
parser_errposition(pstate, relation->location)));
else
+ {
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_TABLE),
+ errmsg("invalid reference to table \"%s\" in publication WHERE expression",
+ relation->relname),
+ parser_errposition(pstate, relation->location)));
+
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_TABLE),
errmsg("missing FROM-clause entry for table \"%s\"",
relation->relname),
parser_errposition(pstate, relation->location)));
+ }
}
/*
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..e7905ed 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,80 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get relation qual. DISTINCT prevents the same expression of a table in
+ * multiple publications from being included multiple times in the final
+ * expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ /*
+ * Multiple row-filter expressions for the same table will later be
+ * combined by the COPY using OR, but this means that if any of the
+ * filters is null, then effectively none of the other filters is
+ * meaningful. So this loop also checks for null filters and exits
+ * early if any are encountered.
+ */
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+
+ if (isnull)
+ {
+ /*
+ * A single null filter nullifies the effect of any other filter for this
+ * table.
+ */
+ if (*qual)
+ {
+ list_free_deep(*qual);
+ *qual = NIL;
+ }
+ break;
+ }
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +887,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +896,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +907,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with OR'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +927,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of OR'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
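Under these changes, for a table with two fetched filters the initial-sync command built by copy_table would look roughly like this (schema, table, and column names invented):

```sql
COPY (SELECT id, flag FROM public.t
      WHERE (id > 1000) OR (flag IS TRUE)) TO STDOUT;
```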
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6f6a203..3b85915 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -15,16 +15,24 @@
#include "access/tupconvert.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/makefuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -116,6 +124,17 @@ typedef struct RelationSyncEntry
PublicationActions pubactions;
/*
+ * Row-filter related members:
+ * The flag 'rowfilter_valid' indicates if the exprstate has been assigned
+ * yet or not. We cannot just use the exprstate value for this purpose
+ * because there might be no filter at all for the current relid (i.e.
+ * exprstate is NULL).
+ */
+ bool rowfilter_valid;
+ ExprState *exprstate; /* ExprState for row filter(s) */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+
+ /*
* OID of the relation to publish changes as. For a partition, this may
* be set to one of its ancestors whose schema will be used when
* replicating changes, if publish_via_partition_root is set for the
@@ -137,7 +156,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +165,13 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +647,265 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Prepare the executor state for evaluating row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+
+ /* Cache ExprState using CacheMemoryContext. */
+ Assert(CurrentMemoryContext == CacheMemoryContext);
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be cast to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * This is the same code as ExecPrepareExpr() but that is not used because
+ * we have no EState to pass it. There should probably be another function
+ * in the executor to handle the execution outside a normal Plan tree
+ * context.
+ */
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+
+ return exprstate;
+}
+
+/*
+ * Evaluate the row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false, i.e. the change
+ * isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change should be replicated, false otherwise.
+ */
+static bool
+pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ List *rfnodes = NIL;
+ int n_filters;
+
+ /*
+ * If the row filter cache is currently flagged "invalid", we don't know
+ * yet whether there are any row filters for this relation.
+ *
+ * This code is usually executed only once per relation.
+ *
+ * NOTE: The ExprState cache could have been created up-front in
+ * get_rel_sync_entry() instead of being assigned on-the-fly below. The
+ * reason for doing it here is that there are scenarios where
+ * get_rel_sync_entry() is called but no row will be published. For
+ * example, for truncate we may not need any row evaluation, so there is
+ * no need to compute it. It would also be wasted work if an error occurs
+ * before the filter is actually evaluated, and future operations that
+ * use get_rel_sync_entry() may not need to build the ExprState either.
+ * Furthermore, because the decision to publish or not is made AFTER the
+ * call to get_rel_sync_entry(), the filter evaluation may turn out to be
+ * unnecessary altogether. So the decision was to defer this logic to the
+ * last moment, when we know it will be needed.
+ */
+ if (!entry->rowfilter_valid)
+ {
+ MemoryContext oldctx;
+
+ /* Release the tuple table slot if it already exists. */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Determine whether there are any row filters for this relation. If
+ * there are, prepare the necessary ExprState and cache it in
+ * entry->exprstate.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ *
+ * NOTE: Multiple row filters for the same table are combined by OR-ing
+ * them together, but this means that if (in any of the publications)
+ * there is *no* filter, then effectively none of the other filters have
+ * any meaning either.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+
+ /*
+ * Look up whether there is a row filter and, if so, remember it in a
+ * list. After this 'publications' loop we will combine all the filters.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(entry->publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes = lappend(rfnodes, rfnode);
+ MemoryContextSwitchTo(oldctx);
+
+ ReleaseSysCache(rftuple);
+ }
+ else
+ {
+ /*
+ * If there is no row filter, then any other row filters for this table
+ * have no effect either (because filters are OR-ed together), so we can
+ * discard anything found so far and exit the publications loop early.
+ */
+ if (rfnodes)
+ {
+ list_free_deep(rfnodes);
+ rfnodes = NIL;
+ }
+ ReleaseSysCache(rftuple);
+ break;
+ }
+ }
+
+ } /* loop all subscribed publications */
+
+ /*
+ * Combine all the row filters (if any) into a single filter, and then
+ * build the ExprState for it.
+ */
+ n_filters = list_length(rfnodes);
+ if (n_filters > 0)
+ {
+ Node *rfnode;
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes, -1) : linitial(rfnodes);
+ entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
+
+ /*
+ * Create a tuple table slot for row filter. TupleDesc must live as
+ * long as the cache remains.
+ */
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ entry->rowfilter_valid = true;
+ }
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ if (message_level_is_interesting(DEBUG3))
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * NOTE: Multiple publication row filters have already been combined into
+ * a single exprstate (and we bailed out above if there was none).
+ */
+ result = pgoutput_row_filter_exec_expr(entry->exprstate, ecxt);
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +932,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +956,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +963,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * Schema should be sent before the logic that replaces the
+ * relation because it also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +996,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +1030,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +1099,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1116,9 +1421,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1139,8 +1445,11 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->rowfilter_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1245,9 +1554,6 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
}
list_free(pubids);
@@ -1354,6 +1660,21 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanup. (It will be rebuilt later if needed.)
+ */
+ entry->rowfilter_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ if (entry->exprstate != NULL)
+ {
+ pfree(entry->exprstate);
+ entry->exprstate = NULL;
+ }
}
}
@@ -1365,6 +1686,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1696,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1392,6 +1716,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index ea721d9..fb5cfc5 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3150,17 +3150,21 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
" JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
"WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
"UNION\n"
"SELECT pubname\n"
+ " , pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+ " JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
"WHERE pr.prrelid = '%s'\n"
"UNION\n"
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
"WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
"ORDER BY 1;",
@@ -3196,6 +3200,13 @@ describeOneTableDetails(const char *schemaname,
printfPQExpBuffer(&buf, " \"%s\"",
PQgetvalue(result, i, 0));
+ /* row filter (if any) */
+ if (pset.sversion >= 150000)
+ {
+ if (!PQgetisnull(result, i, 1))
+ appendPQExpBuffer(&buf, " WHERE %s", PQgetvalue(result, i, 1));
+ }
+
printTableAddFooter(&cont, buf.data);
}
PQclear(result);
@@ -6320,8 +6331,12 @@ addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
for (i = 0; i < count; i++)
{
if (!singlecol)
+ {
printfPQExpBuffer(buf, " \"%s.%s\"", PQgetvalue(res, i, 0),
PQgetvalue(res, i, 1));
+ if (!PQgetisnull(res, i, 2))
+ appendPQExpBuffer(buf, " WHERE %s", PQgetvalue(res, i, 2));
+ }
else
printfPQExpBuffer(buf, " \"%s\"", PQgetvalue(res, i, 0));
@@ -6450,8 +6465,15 @@ describePublications(const char *pattern)
{
/* Get the tables for the specified publication */
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 902f2f2..96c55f6 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -20,6 +20,7 @@
#include "catalog/genbki.h"
#include "catalog/objectaddress.h"
#include "catalog/pg_publication_d.h"
+#include "parser/parse_node.h"
/* ----------------
* pg_publication definition. cpp turns this into
@@ -86,6 +87,7 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -123,13 +125,16 @@ extern List *GetPubPartitionOptionRelations(List *result,
extern bool is_publishable_relation(Relation rel);
extern bool is_schema_publication(Oid pubid);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
extern char *get_publication_name(Oid pubid, bool missing_ok);
+extern Node *GetTransformedWhereClause(ParseState *pstate,
+ PublicationRelInfo *pri,
+ bool bfixupcollation);
#endif /* PG_PUBLICATION_H */
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 067138e..5d58a9f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3641,6 +3641,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
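To illustrate the OR-combination semantics described in the pgoutput_row_filter comments above, here is a quick sketch using the syntax this patch adds. The table and publication names are hypothetical, not part of the patch:

```sql
-- Two publications publish the same table with different row filters.
CREATE TABLE orders (id int PRIMARY KEY, amount numeric);

CREATE PUBLICATION pub_small FOR TABLE orders WHERE (amount < 100);
CREATE PUBLICATION pub_large FOR TABLE orders WHERE (amount > 1000);

-- A subscriber consuming both publications receives rows matching either
-- filter, i.e. the filters are OR-ed: (amount < 100) OR (amount > 1000).
-- If a third publication added 'orders' with no WHERE clause at all, every
-- row would be replicated, rendering the other filters moot.
```

This matches the behavior exercised by the tab_rowfilter_2 case in the TAP test below, where the same table appears in tap_pub_1 and tap_pub_2 with different filters.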
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index c096fbd..1e78a04 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -239,6 +239,152 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
+ "public.testpub_rf_tbl3" WHERE ((e > 1000) AND (e < 2000))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE ((e > 1000) AND (e < 2000))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
+
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dplus_rf_yes FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dplus_rf_no FOR TABLE testpub_rf_tbl1;
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+ Table "public.testpub_rf_tbl1"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+----------+--------------+-------------
+ a | integer | | | | plain | |
+ b | text | | | | extended | |
+Publications:
+ "testpub_dplus_rf_no"
+ "testpub_dplus_rf_yes" WHERE (a > 1)
+
+DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+ Publication testpub_syntax1
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (e < 999)
+
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+ Publication testpub_syntax2
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "testpub_rf_myschema.testpub_rf_tbl5" WHERE (h < 999)
+
+DROP PUBLICATION testpub_syntax2;
+-- fail - a WHERE row-filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a =...
+ ^
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+ERROR: WHERE clause for schema not allowed
+LINE 1: ...ax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf...
+ ^
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if any of them has a WHERE row-filter
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+ERROR: cannot add relation "testpub_rf_myschema1.testpub_rf_tb16" to publication
+DETAIL: Table's schema "testpub_rf_myschema1" is already part of the publication or part of the specified schema list.
+RESET client_min_messages;
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 0662882..9bcd7d2 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -134,6 +134,80 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dplus_rf_yes FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dplus_rf_no FOR TABLE testpub_rf_tbl1;
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;
+-- fail - a WHERE row-filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if any of them has a WHERE row-filter
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+RESET client_min_messages;
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
new file mode 100644
index 0000000..64e71d0
--- /dev/null
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -0,0 +1,357 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 10;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4a FOR TABLE tab_rowfilter_4 WHERE (c % 2 = 0)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4b FOR TABLE tab_rowfilter_4"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) SELECT generate_series(1, 10)");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4a, tap_pub_4b"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if
+# the c value is divisible by 2 OR 3 (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(13|2|20), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but when one publication has no filter it
+# just means OR everything - i.e. the same as no filter at all.
+# Expect all rows: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(10|1|10), 'check initial data copy from table tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false, so use the row
+# filter from the partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) VALUES (21), (22), (23), (24), (25)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) VALUES (0), (11), (12)");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if
+# the c value is divisible by 2 OR 3.
+#
+# Expect original rows (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+# Plus (21, 22, 24)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(16|2|24), 'check replicated rows to tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but when one publication has no filter it
+# just means OR everything - i.e. the same as no filter at all.
+# Expect all rows from initial copy: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+# And also (0, 11, 12)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, because b = NULL makes "b <> 'filtered'" evaluate to NULL, which is treated as false
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, the row filter references column b, which is not part
+# of the PK or REPLICA IDENTITY, and the old tuple has b = NULL, so the
+# row filter evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true, so use the row
+# filter from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
v45-0002-PS-Row-filter-validation-walker.patchapplication/octet-stream; name=v45-0002-PS-Row-filter-validation-walker.patchDownload
From e8352872c75bc917398dbd41c8e273a2c98bbc34 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 8 Dec 2021 04:46:07 -0500
Subject: [PATCH v45 2/5] PS - Row filter validation walker
This patch implements a parse-tree "walker" to validate a row-filter expression.
REPLICA IDENTITY validation
---------------------------
For publish modes "delete" and "update", it validates that any columns referenced
in the filter expression are part of the REPLICA IDENTITY or primary key.
Discussion: https://www.postgresql.org/message-id/CAA4eK1Kyax-qnVPcXzODu3JmA4vtgAjUSYPUK1Pm3vBL5gC81g%40mail.gmail.com
Expression Node-kind validation
-------------------------------
Only simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no user-defined functions.
- no user-defined types.
- no system functions (unless they are IMMUTABLE). See design decision at [1].
- no parse nodes of any kind other than Var, OpExpr, Const, BoolExpr, FuncExpr, NullIfExpr, NullTest, BooleanTest
[1] https://www.postgresql.org/message-id/CAA4eK1%2BXoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ%40mail.gmail.com
---
src/backend/catalog/pg_publication.c | 199 +++++++++++++++++++++++++++++-
src/backend/parser/parse_agg.c | 11 +-
src/backend/parser/parse_expr.c | 19 +--
src/backend/parser/parse_func.c | 3 +-
src/backend/parser/parse_oper.c | 7 --
src/test/regress/expected/publication.out | 146 +++++++++++++++++++---
src/test/regress/sql/publication.sql | 108 +++++++++++++++-
src/test/subscription/t/027_row_filter.pl | 7 +-
src/tools/pgindent/typedefs.list | 1 +
9 files changed, 444 insertions(+), 57 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index dff1625..1630d26 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,9 +33,11 @@
#include "catalog/pg_publication_namespace.h"
#include "catalog/pg_publication_rel.h"
#include "catalog/pg_type.h"
+#include "catalog/pg_proc.h"
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parse_clause.h"
#include "parser/parse_collate.h"
#include "parser/parse_relation.h"
@@ -249,10 +251,200 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
+/* For rowfilter_walker. */
+typedef struct {
+ Relation rel;
+ bool check_replident; /* check if Var is bms_replident member? */
+ Bitmapset *bms_replident; /* bitset of replica identity col indexes */
+} rf_context;
+
/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
+ * The row filter walker checks that the row filter expression is legal.
+ *
+ * Rules: Node-type validation
+ * ---------------------------
+ * Allow only simple or compound expressions such as:
+ * - "(Var Op Const)" or
+ * - "(Var Op Var)" or
+ * - "(Var Op Const) Bool (Var Op Const)"
+ * - etc
+ * (where Var is a column of the table this filter belongs to)
+ *
+ * Specifically,
+ * - User-defined operators are not allowed.
+ * - User-defined functions are not allowed.
+ * - User-defined types are not allowed.
+ * - System functions that are not IMMUTABLE are not allowed.
+ * - NULLIF is allowed.
+ * - IS NULL is allowed.
+ * - IS TRUE/FALSE is allowed.
+ *
+ * Notes:
+ *
+ * We don't allow user-defined functions/operators/types because (a) if the
+ * user drops such an object, or the function raises any other error, the
+ * walsender won't be able to recover even after the function's problem is
+ * fixed, because a historic snapshot is used to access the row filter;
+ * (b) a function could access any other table, which won't work because of
+ * the historic snapshot used in the logical decoding environment.
+ *
+ * We don't allow anything other than immutable built-in functions because
+ * non-immutable functions can access the database, which would lead to
+ * problem (b) mentioned in the previous paragraph.
+ *
+ * Rules: Replica Identity validation
+ * -----------------------------------
+ * If the flag context.check_replident is true then validate that every variable
+ * referenced by the filter expression is a valid member of the allowed set of
+ * replica identity columns (context.bms_replident).
*/
+static bool
+rowfilter_walker(Node *node, rf_context *context)
+{
+ char *forbidden = NULL;
+ bool too_complex = false;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+
+ /* User-defined types not allowed. */
+ if (var->vartype >= FirstNormalObjectId)
+ forbidden = _("user-defined types are not allowed");
+
+ /* Optionally, do replica identity validation of the referenced column. */
+ if (context->check_replident)
+ {
+ Oid relid = RelationGetRelid(context->rel);
+ AttrNumber attnum = var->varattno;
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, context->bms_replident))
+ {
+ const char *colname = get_attname(relid, attnum, false);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot add relation \"%s\" to publication",
+ RelationGetRelationName(context->rel)),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+ }
+ }
+ else if (IsA(node, Const) || IsA(node, BoolExpr) || IsA(node, NullIfExpr)
+ || IsA(node, NullTest) || IsA(node, BooleanTest))
+ {
+ /* OK */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *)node)->opno >= FirstNormalObjectId)
+ forbidden = _("user-defined operators are not allowed");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ Oid funcid = ((FuncExpr *)node)->funcid;
+ char *funcname = get_func_name(funcid);
+
+ /*
+ * User-defined functions are not allowed.
+ * System-functions that are not IMMUTABLE are not allowed.
+ */
+ if (funcid >= FirstNormalObjectId)
+ {
+ forbidden = psprintf(_("user-defined functions are not allowed: %s"),
+ funcname);
+ }
+ else
+ {
+ if (func_volatile(funcid) != PROVOLATILE_IMMUTABLE)
+ forbidden = psprintf(_("system functions that are not IMMUTABLE are not allowed: %s"),
+ funcname);
+ }
+ }
+ else
+ {
+ elog(DEBUG1, "the row filter contained something unexpected: %s", nodeToString(node));
+ too_complex = true;
+ }
+
+ if (too_complex)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(context->rel)),
+ errhint("only simple expressions using columns, constants and immutable system functions are allowed")
+ ));
+
+ if (forbidden)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(context->rel)),
+ errdetail("%s", forbidden)
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)context);
+}
+
+/*
+ * Check if the row-filter is valid according to the following rules:
+ *
+ * 1. Only certain simple node types are permitted in the expression. See
+ * function rowfilter_walker for details.
+ *
+ * 2. If the publish operation contains "delete" or "update" then only columns
+ * that are allowed by the REPLICA IDENTITY rules are permitted to be used in
+ * the row-filter WHERE clause.
+ */
+static void
+rowfilter_expr_checker(Publication *pub, Relation rel, Node *rfnode)
+{
+ rf_context context = {0};
+
+ context.rel = rel;
+
+ /*
+ * For "delete" or "update", check that filter cols are also valid replica
+ * identity cols.
+ */
+ if (pub->pubactions.pubdelete || pub->pubactions.pubupdate)
+ {
+ char replica_identity = rel->rd_rel->relreplident;
+
+ if (replica_identity == REPLICA_IDENTITY_FULL)
+ {
+ /*
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter too.
+ */
+ }
+ else
+ {
+ context.check_replident = true;
+
+ /*
+ * Find what are the cols that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (replica_identity == REPLICA_IDENTITY_DEFAULT)
+ context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else
+ context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
+ }
+ }
+
+ /*
+ * Walk the parse-tree of this publication row filter expression and throw an
+ * error if anything not permitted or unexpected is encountered.
+ */
+ rowfilter_walker(rfnode, &context);
+
+ bms_free(context.bms_replident);
+}
+
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -363,6 +555,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
whereclause = GetTransformedWhereClause(pstate, pri, true);
+
+ /* Validate the row-filter. */
+ rowfilter_expr_checker(pub, targetrel, whereclause);
}
/* Form a tuple. */
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..c95e14d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,10 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
- if (isAgg)
- err = _("aggregate functions are not allowed in publication WHERE expressions");
- else
- err = _("grouping operations are not allowed in publication WHERE expressions");
-
+ /*
+ * OK for now. The row-filter validation is done later by a walker
+ * function (see pg_publication).
+ */
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +950,7 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
- err = _("window functions are not allowed in publication WHERE expressions");
+ /* okay (see function row_filter_walker) */
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..6d47bf8 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,19 +200,8 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- {
- /*
- * Forbid functions in publication WHERE condition
- */
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("functions are not allowed in publication WHERE expressions"),
- parser_errposition(pstate, exprLocation(expr))));
-
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
- }
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -1777,7 +1766,7 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
- err = _("cannot use subquery in publication WHERE expression");
+ /* okay (see function row_filter_walker) */
break;
/*
@@ -3100,7 +3089,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 29bebb7..cfae0da 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,8 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
- err = _("set-returning functions are not allowed in publication WHERE expressions");
+ /* okay (see function row_filter_walker) */
+ pstate->p_hasTargetSRFs = true;
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..bc34a23 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,13 +718,6 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
- /* Check it's not a custom operator for publication WHERE expressions */
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("user-defined operators are not allowed in publication WHERE expressions"),
- parser_errposition(pstate, location)));
-
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 1e78a04..a772975 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -248,13 +248,15 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish='insert' because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
@@ -264,7 +266,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
@@ -275,7 +277,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE ((e > 1000) AND (e < 2000))
@@ -286,7 +288,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
@@ -308,26 +310,26 @@ Publications:
DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax1
Publication testpub_syntax1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (e < 999)
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax2
Publication testpub_syntax2
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"testpub_rf_myschema.testpub_rf_tbl5" WHERE (h < 999)
@@ -351,19 +353,43 @@ ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: user-defined functions are not allowed: testpub_rf_func99
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: system functions that are not IMMUTABLE are not allowed: random
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
+-- ok - IS NULL is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
+-- ok - IS TRUE/FALSE is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE ((a > 5) IS FALSE);
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: user-defined operators are not allowed
+-- fail - user-defined types disallowed
+CREATE TYPE rf_bug_status AS ENUM ('new', 'open', 'closed');
+CREATE TABLE rf_bug (id serial, description text, status rf_bug_status);
+CREATE PUBLICATION testpub6 FOR TABLE rf_bug WHERE (status = 'open') WITH (publish = 'insert');
+ERROR: invalid publication WHERE expression for relation "rf_bug"
+DETAIL: user-defined types are not allowed
+DROP TABLE rf_bug;
+DROP TYPE rf_bug_status;
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+HINT: only simple expressions using columns, constants and immutable system functions are allowed
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
@@ -385,6 +411,92 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 9bcd7d2..e8242c9 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -143,7 +143,9 @@ CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
CREATE SCHEMA testpub_rf_myschema1;
CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish='insert' because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -162,12 +164,12 @@ RESET client_min_messages;
DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax1
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax2
DROP PUBLICATION testpub_syntax2;
@@ -181,13 +183,33 @@ SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func99() RETURNS integer AS $$ BEGIN RETURN 99; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < testpub_rf_func99());
+-- fail - system functions that are not IMMUTABLE are not allowed; random() is a "volatile" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < random());
+-- ok - system functions that are IMMUTABLE are allowed; int8inc() is an "immutable" function.
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a < int8inc(999));
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
+-- ok - IS NULL is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
+-- ok - IS TRUE/FALSE is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE ((a > 5) IS FALSE);
-- fail - user-defined operators disallowed
CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - user-defined types disallowed
+CREATE TYPE rf_bug_status AS ENUM ('new', 'open', 'closed');
+CREATE TABLE rf_bug (id serial, description text, status rf_bug_status);
+CREATE PUBLICATION testpub6 FOR TABLE rf_bug WHERE (status = 'open') WITH (publish = 'insert');
+DROP TABLE rf_bug;
+DROP TYPE rf_bug_status;
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3;
ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
SET client_min_messages = 'ERROR';
@@ -207,6 +229,82 @@ DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func99();
+
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+-- ok - "a" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- ok - "b" is a PK col
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
+RESET client_min_messages;
+DROP PUBLICATION testpub6;
+-- fail - "c" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "d" is not part of the PK
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+-- fail - "a" is not part of REPLICA IDENTITY
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- ok - "a" is in REPLICA IDENTITY now
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+DROP PUBLICATION testpub6;
+RESET client_min_messages;
+
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index 64e71d0..de6b73d 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -280,9 +282,7 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -291,7 +291,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index f41ef0d..575969c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3501,6 +3501,7 @@ replace_rte_variables_context
ret_type
rewind_source
rewrite_event
+rf_context
rijndael_ctx
rm_detail_t
role_auth_extra
--
1.8.3.1
v45-0003-Support-updates-based-on-old-and-new-tuple-in-ro.patch (application/octet-stream)
From 1c52e41cc8821b2034824e72516bccd669a75ef2 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 8 Dec 2021 05:32:44 -0500
Subject: [PATCH v45 3/5] Support updates based on old and new tuple in row
filters.
When applying a row filter to updates, check both the old_tuple and the
new_tuple to decide how an update needs to be transformed:
UPDATE
old-row (match)    new-row (no match) -> DELETE
old-row (no match) new-row (match)    -> INSERT
old-row (match)    new-row (match)    -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Tuples that have already been deformed are also cached in slots to avoid
deforming the same tuple multiple times.
Author: Ajin Cherian
---
src/backend/replication/logical/proto.c | 38 ++++--
src/backend/replication/pgoutput/pgoutput.c | 194 +++++++++++++++++++++++++---
src/include/replication/logicalproto.h | 7 +-
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/027_row_filter.pl | 4 +-
src/tools/pgindent/typedefs.list | 1 +
6 files changed, 212 insertions(+), 38 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..110ccff 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -31,8 +32,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
- HeapTuple tuple, bool binary);
-
+ HeapTuple tuple, TupleTableSlot *slot,
+ bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -410,7 +411,7 @@ logicalrep_write_insert(StringInfo out, TransactionId xid, Relation rel,
pq_sendint32(out, RelationGetRelid(rel));
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, NULL, binary);
}
/*
@@ -442,7 +443,8 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
*/
void
logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
- HeapTuple oldtuple, HeapTuple newtuple, bool binary)
+ HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary)
{
pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
@@ -463,11 +465,11 @@ logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
pq_sendbyte(out, 'O'); /* old tuple follows */
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, oldslot, binary);
}
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, newslot, binary);
}
/*
@@ -536,7 +538,7 @@ logicalrep_write_delete(StringInfo out, TransactionId xid, Relation rel,
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, NULL, binary);
}
/*
@@ -749,13 +751,16 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
-logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binary)
+logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, TupleTableSlot *slot,
+bool binary)
{
TupleDesc desc;
- Datum values[MaxTupleAttributeNumber];
- bool isnull[MaxTupleAttributeNumber];
+ Datum *values;
+ bool *isnull;
int i;
uint16 nliveatts = 0;
+ Datum attr_values[MaxTupleAttributeNumber];
+ bool attr_isnull[MaxTupleAttributeNumber];
desc = RelationGetDescr(rel);
@@ -771,7 +776,17 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binar
enlargeStringInfo(out, tuple->t_len +
nliveatts * (1 + 4));
- heap_deform_tuple(tuple, desc, values, isnull);
+ if (TupIsNull(slot))
+ {
+ values = attr_values;
+ isnull = attr_isnull;
+ heap_deform_tuple(tuple, desc, values, isnull);
+ }
+ else
+ {
+ values = slot->tts_values;
+ isnull = slot->tts_isnull;
+ }
/* Write the values */
for (i = 0; i < desc->natts; i++)
@@ -832,6 +847,7 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binar
ReleaseSysCache(typtup);
}
+
}
/*
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 3b85915..0ccffa7 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,6 +13,7 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
#include "catalog/pg_publication_rel.h"
@@ -25,6 +26,7 @@
#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
#include "utils/builtins.h"
@@ -132,7 +134,10 @@ typedef struct RelationSyncEntry
*/
bool rowfilter_valid;
ExprState *exprstate; /* ExprState for row filter(s) */
- TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
+ TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
+ TupleTableSlot *tmp_new_tuple; /* slot for temporary new tuple used for expression evaluation */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -167,10 +172,15 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple,
- HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, TupleTableSlot *slot,
+ RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry,
+ ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -734,18 +744,112 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * Update is checked against the row filter, if any.
+ * Updates are transformed to inserts and deletes based on the
+ * old_tuple and new_tuple. The transformed action is returned in the
+ * action parameter; if it is not changed, the action remains an update.
+ * old-row (no match) new-row (no match) -> (drop change)
+ * old-row (no match) new row (match) -> INSERT
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (match) new row (match) -> UPDATE
+ * If the change is to be replicated returns true, else false.
*/
static bool
-pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry)
+pgoutput_row_filter_update_check(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, RelationSyncEntry *entry, ReorderBufferChangeType *action)
+{
+ TupleDesc desc = RelationGetDescr(relation);
+ int i;
+ bool old_matched, new_matched;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate)
+ return true;
+
+ /* update requires a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /*
+ * If no old_tuple, then none of the replica identity columns changed
+ * and this would reduce to a simple update.
+ */
+ if (!oldtuple)
+ {
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ return pgoutput_row_filter(relation, NULL, newtuple, NULL, entry);
+ }
+
+ old_slot = entry->old_tuple;
+ new_slot = entry->new_tuple;
+ tmp_new_slot = entry->tmp_new_tuple;
+ ExecClearTuple(old_slot);
+ ExecClearTuple(new_slot);
+ ExecClearTuple(tmp_new_slot);
+
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, new_slot);
+
+ /*
+ * For updates, both the newtuple and the oldtuple need to be checked
+ * against the row filter. The newtuple might not contain all the
+ * replica identity columns, in which case the missing values need to be
+ * copied over from the oldtuple.
+ */
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ /* if the column in the new_tuple is null, nothing to do */
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ /*
+ * Unchanged toasted replica identity columns are only
+ * detoasted in the old tuple; copy them over to the newtuple.
+ */
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ Assert(!old_slot->tts_isnull[i] &&
+ !VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]));
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ }
+
+ }
+
+ old_matched = pgoutput_row_filter(relation, NULL, NULL, old_slot, entry);
+ new_matched = pgoutput_row_filter(relation, NULL, NULL, tmp_new_slot, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (!old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && old_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter, the first time.
+ */
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
{
- EState *estate;
- ExprContext *ecxt;
ListCell *lc;
- bool result = true;
- Oid relid = RelationGetRelid(relation);
List *rfnodes = NIL;
int n_filters;
@@ -857,16 +961,34 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
/*
- * Create a tuple table slot for row filter. TupleDesc must live as
- * long as the cache remains.
+ * Create tuple table slots for row filter. Create a copy of the
+ * TupleDesc as it needs to live as long as the cache remains.
*/
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
MemoryContextSwitchTo(oldctx);
}
entry->rowfilter_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(Relation relation, HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *slot,
+RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
/* Bail out if there is no row filter */
if (!entry->exprstate)
@@ -885,7 +1007,12 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
ecxt = GetPerTupleExprContext(estate);
ecxt->ecxt_scantuple = entry->scantuple;
- ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+ if (newtuple || oldtuple)
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+ else
+ {
+ ecxt->ecxt_scantuple = slot;
+ }
/*
* NOTE: Multiple publication row-filters have already been combined to a
@@ -898,7 +1025,6 @@ pgoutput_row_filter(PGOutputData *data, Relation relation, HeapTuple oldtuple, H
}
/* Cleanup allocated resources */
- ResetExprContext(ecxt);
FreeExecutorState(estate);
PopActiveSnapshot();
@@ -956,6 +1082,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -964,7 +1093,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple tuple = &change->data.tp.newtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, NULL, tuple, relentry))
+ if (!pgoutput_row_filter(relation, NULL, tuple, NULL, relentry))
break;
/*
@@ -995,9 +1124,10 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
- /* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, newtuple, relentry))
+ if (!pgoutput_row_filter_update_check(relation, oldtuple, newtuple, relentry,
+ &modified_action))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1020,8 +1150,27 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ logicalrep_write_update(ctx->out, xid, relation,
+ oldtuple, newtuple, relentry->old_tuple,
+ relentry->new_tuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1031,7 +1180,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
/* Check row filter. */
- if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ if (!pgoutput_row_filter(relation, oldtuple, NULL, NULL, relentry))
break;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1449,6 +1598,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
+ entry->new_tuple = NULL;
+ entry->old_tuple = NULL;
+ entry->tmp_new_tuple = NULL;
entry->exprstate = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..427c40a 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -211,7 +212,11 @@ extern void logicalrep_write_insert(StringInfo out, TransactionId xid,
extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup);
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
- HeapTuple newtuple, bool binary);
+ HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index de6b73d..a2f25f6 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -277,7 +277,8 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+# but the row is deleted afterwards because the UPDATE below makes it fail the filter.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -289,7 +290,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 575969c..e8dc5ad 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2198,6 +2198,7 @@ ReorderBufferApplyChangeCB
ReorderBufferApplyTruncateCB
ReorderBufferBeginCB
ReorderBufferChange
+ReorderBufferChangeType
ReorderBufferCommitCB
ReorderBufferCommitPreparedCB
ReorderBufferDiskChange
--
1.8.3.1
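For reference, the transformed-action behavior exercised by the updated TAP test can be sketched as follows (publication name and exact filter are taken from 027_row_filter.pl; treat this as an illustration, not the test itself):

```sql
-- Sketch: with a row filter on the publisher, an UPDATE whose new row no
-- longer matches the filter is replicated as a DELETE, so the row
-- disappears on the subscriber.
CREATE TABLE tab_rowfilter_1 (a int PRIMARY KEY, b text);
CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1
    WHERE (a > 1000 AND b <> 'filtered');

INSERT INTO tab_rowfilter_1 VALUES (1600, 'test 1600'); -- replicated
UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600;     -- new row fails the
                                                        -- filter; replicated
                                                        -- as a DELETE
```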
Attachment: v45-0004-Tab-auto-complete-and-pgdump-support-for-Row-Fil.patch (application/octet-stream)
From 8dce8d67068315d2075abba1bf590faa658116fe Mon Sep 17 00:00:00 2001
From: Ajin Cherian <ajinc@fast.au.fujitsu.com>
Date: Wed, 8 Dec 2021 05:34:59 -0500
Subject: [PATCH v45 4/5] Tab auto-complete and pgdump support for Row Filter.
tab-auto-complete
-----------------
e.g.
"CREATE PUBLICATION <name> FOR TABLE <name>" - complete with "WHERE (".
"ALTER PUBLICATION <name> ADD|SET TABLE <name>" - complete with "WHERE (".
Author: Peter Smith
pg_dump
-------
Author: Euler Taveira
---
src/bin/pg_dump/pg_dump.c | 24 ++++++++++++++++++++----
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/tab-complete.c | 15 +++++++++++++--
3 files changed, 34 insertions(+), 6 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 10a86f9..e595c7f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4265,6 +4265,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4275,9 +4276,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4286,6 +4294,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4326,6 +4335,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationObject(&(pubrinfo[j].dobj), fout);
@@ -4393,8 +4406,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 6dccb4b..74f82cd 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -633,6 +633,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 2f412ca..c1591f4 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,14 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (");
+ /* "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with table attributes */
+ /* "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2777,11 +2785,14 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("TABLES", "TABLES IN SCHEMA");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
- else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
+ /* "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with table attributes */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
/*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
--
1.8.3.1
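With the dumpPublicationTable() change above, a publication table entry that carries a row filter would be dumped roughly like the following (sketched from the code change, not captured from an actual dump; names follow the example in the original mail):

```sql
ALTER PUBLICATION foo ADD TABLE ONLY public.departments WHERE (id > 2000 AND id <= 3000);
```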
Attachment: v45-0006-do-replica-identity-validation-when-UPDATE-or-DELETE.patch (application/octet-stream)
From 4a370f0d6586091979512a11797e8cb0b05affce Mon Sep 17 00:00:00 2001
From: "houzj.fnst" <houzj.fnst@fujitsu.com>
Date: Thu, 9 Dec 2021 09:13:50 +0800
Subject: [PATCH] do REPLICA IDENTITY validation when UPDATE or DELETE
For publish modes "delete" and "update", validate that all columns referenced
in the filter expression are part of the REPLICA IDENTITY or primary key.
Move the row filter column validation to CheckCmdReplicaIdentity, so that
the validation is executed only when an actual UPDATE or DELETE is executed on
the published relation. This is consistent with the existing replica identity
check and detects row filter changes in time.
Cache the result of the row filter column validation in the relcache to
reduce the cost of the validation. This is safe because every operation that
changes the row filter or the replica identity invalidates the relcache.
---
src/backend/catalog/pg_publication.c | 104 +--------
src/backend/executor/execReplication.c | 36 ++-
src/backend/utils/cache/relcache.c | 262 +++++++++++++++++++---
src/include/utils/rel.h | 7 +
src/include/utils/relcache.h | 1 +
src/test/regress/expected/publication.out | 94 +++++---
src/test/regress/sql/publication.sql | 79 ++++---
7 files changed, 388 insertions(+), 195 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 1630d2650c..3decb3935c 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -251,13 +251,6 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
-/* For rowfilter_walker. */
-typedef struct {
- Relation rel;
- bool check_replident; /* check if Var is bms_replident member? */
- Bitmapset *bms_replident; /* bitset of replica identity col indexes */
-} rf_context;
-
/*
* The row filter walker checks that the row filter expression is legal.
*
@@ -291,15 +284,9 @@ typedef struct {
* We don't allow anything other than immutable built-in functions because those
* (not immutable ones) can access database and would lead to the problem (b)
* mentioned in the previous paragraph.
- *
- * Rules: Replica Identity validation
- * -----------------------------------
- * If the flag context.check_replident is true then validate that every variable
- * referenced by the filter expression is a valid member of the allowed set of
- * replica identity columns (context.bms_replindent)
*/
static bool
-rowfilter_walker(Node *node, rf_context *context)
+rowfilter_walker(Node *node, Relation relation)
{
char *forbidden = NULL;
bool too_complex = false;
@@ -314,25 +301,6 @@ rowfilter_walker(Node *node, rf_context *context)
/* User-defined types not allowed. */
if (var->vartype >= FirstNormalObjectId)
forbidden = _("user-defined types are not allowed");
-
- /* Optionally, do replica identify validation of the referenced column. */
- if (context->check_replident)
- {
- Oid relid = RelationGetRelid(context->rel);
- AttrNumber attnum = var->varattno;
-
- if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, context->bms_replident))
- {
- const char *colname = get_attname(relid, attnum, false);
-
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
- errmsg("cannot add relation \"%s\" to publication",
- RelationGetRelationName(context->rel)),
- errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
- colname)));
- }
- }
}
else if (IsA(node, Const) || IsA(node, BoolExpr) || IsA(node, NullIfExpr)
|| IsA(node, NullTest) || IsA(node, BooleanTest))
@@ -375,74 +343,18 @@ rowfilter_walker(Node *node, rf_context *context)
if (too_complex)
ereport(ERROR,
(errmsg("invalid publication WHERE expression for relation \"%s\"",
- RelationGetRelationName(context->rel)),
+ RelationGetRelationName(relation)),
errhint("only simple expressions using columns, constants and immutable system functions are allowed")
));
if (forbidden)
ereport(ERROR,
(errmsg("invalid publication WHERE expression for relation \"%s\"",
- RelationGetRelationName(context->rel)),
+ RelationGetRelationName(relation)),
errdetail("%s", forbidden)
));
- return expression_tree_walker(node, rowfilter_walker, (void *)context);
-}
-
-/*
- * Check if the row-filter is valid according to the following rules:
- *
- * 1. Only certain simple node types are permitted in the expression. See
- * function rowfilter_walker for details.
- *
- * 2. If the publish operation contains "delete" or "update" then only columns
- * that are allowed by the REPLICA IDENTITY rules are permitted to be used in
- * the row-filter WHERE clause.
- */
-static void
-rowfilter_expr_checker(Publication *pub, Relation rel, Node *rfnode)
-{
- rf_context context = {0};
-
- context.rel = rel;
-
- /*
- * For "delete" or "update", check that filter cols are also valid replica
- * identity cols.
- */
- if (pub->pubactions.pubdelete || pub->pubactions.pubupdate)
- {
- char replica_identity = rel->rd_rel->relreplident;
-
- if (replica_identity == REPLICA_IDENTITY_FULL)
- {
- /*
- * FULL means all cols are in the REPLICA IDENTITY, so all cols are
- * allowed in the row-filter too.
- */
- }
- else
- {
- context.check_replident = true;
-
- /*
- * Find what are the cols that are part of the REPLICA IDENTITY.
- * Note that REPLICA IDENTIY DEFAULT means primary key or nothing.
- */
- if (replica_identity == REPLICA_IDENTITY_DEFAULT)
- context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
- else
- context.bms_replident = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_IDENTITY_KEY);
- }
- }
-
- /*
- * Walk the parse-tree of this publication row filter expression and throw an
- * error if anything not permitted or unexpected is encountered.
- */
- rowfilter_walker(rfnode, &context);
-
- bms_free(context.bms_replident);
+ return expression_tree_walker(node, rowfilter_walker, (void *) relation);
}
List *
@@ -556,8 +468,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
whereclause = GetTransformedWhereClause(pstate, pri, true);
- /* Validate the row-filter. */
- rowfilter_expr_checker(pub, targetrel, whereclause);
+ /*
+ * Walk the parse-tree of this publication row filter expression and
+ * throw an error if anything not permitted or unexpected is
+ * encountered.
+ */
+ rowfilter_walker(whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 574d7d27fd..c069863d56 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -568,14 +568,46 @@ void
CheckCmdReplicaIdentity(Relation rel, CmdType cmd)
{
PublicationActions *pubactions;
+ AttrNumber invalid_rfcolnum;
/* We only need to do checks for UPDATE and DELETE. */
if (cmd != CMD_UPDATE && cmd != CMD_DELETE)
return;
+ if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ return;
+
+ invalid_rfcolnum = RelationGetInvalidRowFilterCol(rel);
+
+ /*
+ * It is only safe to execute UPDATE/DELETE when all columns referenced in
+ * the row filters of the publications which the relation is in are part of
+ * the REPLICA IDENTITY, or when the table does not publish UPDATEs or
+ * DELETEs.
+ */
+ if (invalid_rfcolnum)
+ {
+ const char *colname = get_attname(RelationGetRelid(rel),
+ invalid_rfcolnum, false);
+
+ if (cmd == CMD_UPDATE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot update table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ else if (cmd == CMD_DELETE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot delete from table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Row filter column \"%s\" is not part of the REPLICA IDENTITY",
+ colname)));
+ }
+
/* If relation has replica identity we are always good. */
- if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
- OidIsValid(RelationGetReplicaIndex(rel)))
+ if (OidIsValid(RelationGetReplicaIndex(rel)))
return;
/*
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 105d8d4601..d0c1b3d316 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -56,6 +56,7 @@
#include "catalog/pg_opclass.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "catalog/pg_rewrite.h"
#include "catalog/pg_shseclabel.h"
#include "catalog/pg_statistic_ext.h"
@@ -71,6 +72,8 @@
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/optimizer.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_relation.h"
#include "rewrite/rewriteDefine.h"
#include "rewrite/rowsecurity.h"
#include "storage/lmgr.h"
@@ -84,6 +87,7 @@
#include "utils/memutils.h"
#include "utils/relmapper.h"
#include "utils/resowner_private.h"
+#include "utils/ruleutils.h"
#include "utils/snapmgr.h"
#include "utils/syscache.h"
@@ -5521,57 +5525,169 @@ RelationGetExclusionInfo(Relation indexRelation,
MemoryContextSwitchTo(oldcxt);
}
+/* For invalid_rowfilter_column_walker. */
+typedef struct {
+ AttrNumber invalid_rfcolnum; /* invalid column number */
+ Bitmapset *bms_replident; /* bitset of replica identity col indexes */
+ bool pubviaroot; /* true if we are validating the parent
+ * relation's row filter */
+ Oid relid; /* relid of the relation */
+ Oid parentid; /* relid of the parent relation */
+} rf_context;
+
/*
- * Get publication actions for the given relation.
+ * Check if any columns used in the row-filter WHERE clause are not part of
+ * REPLICA IDENTITY and save the invalid column number in
+ * rf_context::invalid_rfcolnum.
*/
-struct PublicationActions *
-GetRelationPublicationActions(Relation relation)
+static bool
+invalid_rowfilter_column_walker(Node *node, rf_context *context)
{
- List *puboids;
- ListCell *lc;
- MemoryContext oldcxt;
- Oid schemaid;
- PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+ AttrNumber attnum = var->varattno;
+
+ /*
+ * If pubviaroot is true, we need to convert the column number of
+ * parent to the column number of child relation first.
+ */
+ if (context->pubviaroot)
+ {
+ char *colname = get_attname(context->parentid, attnum, false);
+ attnum = get_attnum(context->relid, colname);
+ }
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber,
+ context->bms_replident))
+ {
+ context->invalid_rfcolnum = attnum;
+ return true;
+ }
+ }
+
+ return expression_tree_walker(node, invalid_rowfilter_column_walker,
+ (void *) context);
+}
+
+/*
+ * Append to cur_puboids each member of add_puboids that isn't already in
+ * cur_puboids.
+ *
+ * Also update the topmost parent relation's relid in the publication.
+ */
+static void
+concat_publication_oid(Oid relid,
+ List **cur_puboids,
+ List **toprelid_in_pub,
+ const List *add_puboids)
+{
+ ListCell *lc1,
+ *lc2,
+ *lc3;
+
+ foreach(lc1, add_puboids)
+ {
+ bool is_member = false;
+
+ forboth(lc2, *cur_puboids, lc3, *toprelid_in_pub)
+ {
+ if (lfirst_oid(lc2) == lfirst_oid(lc1))
+ {
+ is_member = true;
+ lfirst_oid(lc3) = relid;
+ }
+ }
+
+ if (!is_member)
+ {
+ *cur_puboids = lappend_oid(*cur_puboids, lfirst_oid(lc1));
+ *toprelid_in_pub = lappend_oid(*toprelid_in_pub, relid);
+ }
+ }
+}
+
+/*
+ * Get the invalid row filter column number for the given relation.
+ *
+ * Traverse all the publications which the relation is in to get the
+ * publication actions. If the publication actions include UPDATE or DELETE,
+ * then validate that all columns referenced in the row filter expression
+ * are part of REPLICA IDENTITY.
+ *
+ * If not all the row filter columns are part of REPLICA IDENTITY, return the
+ * invalid column number, otherwise InvalidAttrNumber.
+ */
+AttrNumber
+RelationGetInvalidRowFilterCol(Relation relation)
+{
+ List *puboids,
+ *toprelid_in_pub;
+ ListCell *lc;
+ MemoryContext oldcxt;
+ Oid schemaid;
+ Oid relid = RelationGetRelid(relation);
+ rf_context context = { 0 };
+ PublicationActions pubactions = { 0 };
+ bool rfcol_valid = true;
+ AttrNumber invalid_rfcolnum = InvalidAttrNumber;
/*
* If not publishable, it publishes no actions. (pgoutput_change() will
* ignore it.)
*/
- if (!is_publishable_relation(relation))
- return pubactions;
-
- if (relation->rd_pubactions)
- return memcpy(pubactions, relation->rd_pubactions,
- sizeof(PublicationActions));
+ if (!is_publishable_relation(relation) || relation->rd_rfcol_valid)
+ return invalid_rfcolnum;
/* Fetch the publication membership info. */
- puboids = GetRelationPublications(RelationGetRelid(relation));
+ toprelid_in_pub = puboids = NIL;
+ concat_publication_oid(relid, &puboids, &toprelid_in_pub,
+ GetRelationPublications(relid));
schemaid = RelationGetNamespace(relation);
- puboids = list_concat_unique_oid(puboids, GetSchemaPublications(schemaid));
+ concat_publication_oid(relid, &puboids, &toprelid_in_pub,
+ GetSchemaPublications(schemaid));
if (relation->rd_rel->relispartition)
{
/* Add publications that the ancestors are in too. */
- List *ancestors = get_partition_ancestors(RelationGetRelid(relation));
+ List *ancestors = get_partition_ancestors(relid);
ListCell *lc;
foreach(lc, ancestors)
{
Oid ancestor = lfirst_oid(lc);
- puboids = list_concat_unique_oid(puboids,
- GetRelationPublications(ancestor));
+ concat_publication_oid(ancestor, &puboids, &toprelid_in_pub,
+ GetRelationPublications(ancestor));
schemaid = get_rel_namespace(ancestor);
- puboids = list_concat_unique_oid(puboids,
- GetSchemaPublications(schemaid));
+ concat_publication_oid(ancestor, &puboids, &toprelid_in_pub,
+ GetSchemaPublications(schemaid));
}
+
+ relid = llast_oid(ancestors);
}
- puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
+ concat_publication_oid(relid, &puboids, &toprelid_in_pub,
+ GetAllTablesPublications());
+
+ /*
+ * Find what are the cols that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (relation->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT)
+ context.bms_replident = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else if (relation->rd_rel->relreplident == REPLICA_IDENTITY_INDEX)
+ context.bms_replident = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_IDENTITY_KEY);
foreach(lc, puboids)
{
Oid pubid = lfirst_oid(lc);
HeapTuple tup;
+
Form_pg_publication pubform;
tup = SearchSysCache1(PUBLICATIONOID, ObjectIdGetDatum(pubid));
@@ -5581,35 +5697,116 @@ GetRelationPublicationActions(Relation relation)
pubform = (Form_pg_publication) GETSTRUCT(tup);
- pubactions->pubinsert |= pubform->pubinsert;
- pubactions->pubupdate |= pubform->pubupdate;
- pubactions->pubdelete |= pubform->pubdelete;
- pubactions->pubtruncate |= pubform->pubtruncate;
+ pubactions.pubinsert |= pubform->pubinsert;
+ pubactions.pubupdate |= pubform->pubupdate;
+ pubactions.pubdelete |= pubform->pubdelete;
+ pubactions.pubtruncate |= pubform->pubtruncate;
ReleaseSysCache(tup);
/*
- * If we know everything is replicated, there is no point to check for
- * other publications.
+ * If the publication actions include UPDATE or DELETE, validate that
+ * all columns referenced in the filter expression are part of the
+ * REPLICA IDENTITY index.
+ *
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter and we can skip the validation.
+ *
+ * If we have already found a row filter column that is not part of the
+ * REPLICA IDENTITY index, skip the validation too.
*/
- if (pubactions->pubinsert && pubactions->pubupdate &&
- pubactions->pubdelete && pubactions->pubtruncate)
+ if ((pubform->pubupdate || pubform->pubdelete) &&
+ relation->rd_rel->relreplident != REPLICA_IDENTITY_FULL &&
+ rfcol_valid)
+ {
+ HeapTuple rftuple;
+
+ if (pubform->pubviaroot)
+ relid = list_nth_oid(toprelid_in_pub,
+ foreach_current_index(lc));
+ else
+ relid = RelationGetRelid(relation);
+
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP,
+ ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pubid));
+
+ if (HeapTupleIsValid(rftuple))
+ {
+ Datum rfdatum;
+ bool rfisnull;
+ Node *rfnode;
+
+ context.pubviaroot = pubform->pubviaroot;
+ context.parentid = relid;
+ context.relid = RelationGetRelid(relation);
+
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
+ Anum_pg_publication_rel_prqual,
+ &rfisnull);
+
+ if (!rfisnull)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfcol_valid = !invalid_rowfilter_column_walker(rfnode,
+ &context);
+ invalid_rfcolnum = context.invalid_rfcolnum;
+ pfree(rfnode);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+ }
+
+ /*
+ * If we know everything is replicated and some columns are not part of
+ * the replica identity, there is no point in checking the other publications.
+ */
+ if (pubactions.pubinsert && pubactions.pubupdate &&
+ pubactions.pubdelete && pubactions.pubtruncate &&
+ !rfcol_valid)
break;
}
+ bms_free(context.bms_replident);
+
if (relation->rd_pubactions)
{
pfree(relation->rd_pubactions);
relation->rd_pubactions = NULL;
}
+ relation->rd_rfcol_valid = rfcol_valid;
+
/* Now save copy of the actions in the relcache entry. */
oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
relation->rd_pubactions = palloc(sizeof(PublicationActions));
- memcpy(relation->rd_pubactions, pubactions, sizeof(PublicationActions));
+ memcpy(relation->rd_pubactions, &pubactions, sizeof(PublicationActions));
MemoryContextSwitchTo(oldcxt);
- return pubactions;
+ return invalid_rfcolnum;
+}
+
+/*
+ * Get publication actions for the given relation.
+ */
+struct PublicationActions *
+GetRelationPublicationActions(Relation relation)
+{
+ PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
+
+ /*
+ * If not publishable, it publishes no actions. (pgoutput_change() will
+ * ignore it.)
+ */
+ if (!is_publishable_relation(relation))
+ return pubactions;
+
+ if (!relation->rd_pubactions)
+ (void) RelationGetInvalidRowFilterCol(relation);
+
+ return memcpy(pubactions, relation->rd_pubactions,
+ sizeof(PublicationActions));
}
/*
@@ -6163,6 +6360,7 @@ load_relcache_init_file(bool shared)
rel->rd_idattr = NULL;
rel->rd_hotblockingattr = NULL;
rel->rd_pubactions = NULL;
+ rel->rd_rfcol_valid = false;
rel->rd_statvalid = false;
rel->rd_statlist = NIL;
rel->rd_fkeyvalid = false;
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 31281279cf..27cec813c0 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -163,6 +163,13 @@ typedef struct RelationData
PublicationActions *rd_pubactions; /* publication actions */
+ /*
+ * True if all columns referenced in the row filters of the publications
+ * the relation is in are part of the replica identity, or if the
+ * publication actions include neither UPDATE nor DELETE.
+ */
+ bool rd_rfcol_valid;
+
/*
* rd_options is set whenever rd_rel is loaded into the relcache entry.
* Note that you can NOT look into rd_rel for this data. NULL means "use
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 82316bba54..25c759f289 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -76,6 +76,7 @@ extern void RelationInitIndexAccessInfo(Relation relation);
/* caller must include pg_publication.h */
struct PublicationActions;
extern struct PublicationActions *GetRelationPublicationActions(Relation relation);
+extern AttrNumber RelationGetInvalidRowFilterCol(Relation relation);
extern void RelationInitTableAccessMethod(Relation relation);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index a7729758af..2ca45d8f2c 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -416,58 +416,61 @@ DROP FUNCTION testpub_rf_func99();
-- More row filter tests for validating column references
CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+create table rf_tbl_abcd_part_pk (a int primary key, b int) partition by RANGE (a);
+create table rf_tbl_abcd_part_pk_1 (b int, a int primary key);
+alter table rf_tbl_abcd_part_pk attach partition rf_tbl_abcd_part_pk_1 FOR VALUES FROM (1) TO (10);
-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
--- ok - "a" is a PK col
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
RESET client_min_messages;
-DROP PUBLICATION testpub6;
+-- ok - "a" is a PK col
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (b > 99);
-- ok - "b" is a PK col
-SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
-RESET client_min_messages;
-DROP PUBLICATION testpub6;
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
-- fail - "c" is not part of the PK
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (d > 99);
-- fail - "d" is not part of the PK
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
DETAIL: Row filter column "d" is not part of the REPLICA IDENTITY
-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-- fail - "a" is not part of REPLICA IDENTITY
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
-- Case 2. REPLICA IDENTITY FULL
ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
-- ok - "c" is in REPLICA IDENTITY now even though not in PK
-SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
-DROP PUBLICATION testpub6;
-RESET client_min_messages;
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-- ok - "a" is in REPLICA IDENTITY now
-SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-DROP PUBLICATION testpub6;
-RESET client_min_messages;
+UPDATE rf_tbl_abcd_nopk SET a = 1;
-- Case 3. REPLICA IDENTITY NOTHING
ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (a > 99);
-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
DETAIL: Row filter column "c" is not part of the REPLICA IDENTITY
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-- fail - "a" is not in REPLICA IDENTITY NOTHING
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
-- Case 4. REPLICA IDENTITY INDEX
ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
@@ -476,26 +479,43 @@ ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (a > 99);
-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_pk" to publication
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
-SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
-DROP PUBLICATION testpub6;
-RESET client_min_messages;
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-- fail - "a" is not in REPLICA IDENTITY INDEX
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-ERROR: cannot add relation "rf_tbl_abcd_nopk" to publication
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
DETAIL: Row filter column "a" is not part of the REPLICA IDENTITY
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (c > 99);
-- ok - "c" is part of REPLICA IDENTITY INDEX
-SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+-- Tests for partitioned table
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (a > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- ok - "a" is a PK col
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (b > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- fail - "b" is not in REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_part_pk_1"
+DETAIL: Row filter column "b" is not part of the REPLICA IDENTITY
DROP PUBLICATION testpub6;
-RESET client_min_messages;
DROP TABLE rf_tbl_abcd_pk;
DROP TABLE rf_tbl_abcd_nopk;
+DROP TABLE rf_tbl_abcd_part_pk;
-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index e8242c95ee..4278bee6fd 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -235,50 +235,53 @@ DROP FUNCTION testpub_rf_func99();
-- More row filter tests for validating column references
CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+create table rf_tbl_abcd_part_pk (a int primary key, b int) partition by RANGE (a);
+create table rf_tbl_abcd_part_pk_1 (b int, a int primary key);
+alter table rf_tbl_abcd_part_pk attach partition rf_tbl_abcd_part_pk_1 FOR VALUES FROM (1) TO (10);
-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
--- ok - "a" is a PK col
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
RESET client_min_messages;
-DROP PUBLICATION testpub6;
+-- ok - "a" is a PK col
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (b > 99);
-- ok - "b" is a PK col
-SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (b > 99);
-RESET client_min_messages;
-DROP PUBLICATION testpub6;
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
-- fail - "c" is not part of the PK
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (d > 99);
-- fail - "d" is not part of the PK
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (d > 99);
+UPDATE rf_tbl_abcd_pk SET a = 1;
-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-- fail - "a" is not part of REPLICA IDENTITY
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+UPDATE rf_tbl_abcd_nopk SET a = 1;
-- Case 2. REPLICA IDENTITY FULL
ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
-- ok - "c" is in REPLICA IDENTITY now even though not in PK
-SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
-DROP PUBLICATION testpub6;
-RESET client_min_messages;
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-- ok - "a" is in REPLICA IDENTITY now
-SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-DROP PUBLICATION testpub6;
-RESET client_min_messages;
+UPDATE rf_tbl_abcd_nopk SET a = 1;
-- Case 3. REPLICA IDENTITY NOTHING
ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (a > 99);
-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-- fail - "a" is not in REPLICA IDENTITY NOTHING
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+UPDATE rf_tbl_abcd_nopk SET a = 1;
-- Case 4. REPLICA IDENTITY INDEX
ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
@@ -287,23 +290,39 @@ ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (a > 99);
-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
-SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (c > 99);
-DROP PUBLICATION testpub6;
-RESET client_min_messages;
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
-- fail - "a" is not in REPLICA IDENTITY INDEX
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (c > 99);
-- ok - "c" is part of REPLICA IDENTITY INDEX
-SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_nopk WHERE (c > 99);
-DROP PUBLICATION testpub6;
-RESET client_min_messages;
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+
+-- Tests for partitioned table
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (a > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- ok - "a" is a PK col
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (b > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- fail - "b" is not in REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+DROP PUBLICATION testpub6;
DROP TABLE rf_tbl_abcd_pk;
DROP TABLE rf_tbl_abcd_nopk;
+DROP TABLE rf_tbl_abcd_part_pk;
-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
--
2.18.4
On Wednesday, December 8, 2021 7:52 PM Ajin Cherian <itsajin@gmail.com> wrote:
On Tue, Dec 7, 2021 at 5:36 PM Peter Smith <smithpb2250@gmail.com> wrote:
We were mid-way through putting together the next v45* when your latest
attachment was posted over the weekend. So we will proceed with our
original plan to post our v45* (tomorrow). After v45* is posted, we will
pause to work out all the differences between your unified patch and our
v45* patch set. Our intention is to integrate as many improvements as
possible from your changes into the v46* etc. that will follow
tomorrow's v45*. On some points, we will most likely need further
discussion.

Posting an update for review comments, using contributions mainly from
Peter Smith.
I've also included changes based on Euler's combined patch, especially
changes to the documentation and test cases.
I have left out Hou-san's 0005 in this patch-set; Hou-san will provide a
rebased update based on it.

This patch-set addresses the following review comments:
Hi,
Thanks for updating the patch.
I noticed a possible issue.
+ /* Check row filter. */
+ if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
...
/* Convert tuple if needed. */
if (relentry->map)
tuple = execute_attr_map_tuple(tuple, relentry->map);
Currently, we execute the row filter before converting the tuple. I think it
could give a wrong result if we are executing a parent table's row filter and
the column order of the parent table is different from the child table's. For
example:
----
create table parent(a int primary key, b int) partition by range (a);
create table child (b int, a int primary key);
alter table parent attach partition child default;
create publication pub for table parent where(a>10) with(PUBLISH_VIA_PARTITION_ROOT);
The attribute number of 'a' is 1 in the filter expression, while 'a' is the
second column in the original child tuple. I think we might need to execute
the filter expression after converting the tuple.
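To make the failure mode concrete, here is a minimal illustrative model (plain Python, not PostgreSQL code) of what goes wrong: the parent's compiled filter refers to attribute number 1, but in the child's tuple layout column "a" is the second attribute, so evaluating the filter before remapping would read column "b" instead.

```python
# Parent: (a int, b int); child partition: (b int, a int), as in the
# CREATE TABLE example above. The filter is "a > 10", compiled against
# the parent's descriptor, where "a" is attribute number 1.
parent_cols = ["a", "b"]
child_cols = ["b", "a"]

def eval_filter_by_attnum(tuple_values, attnum):
    """Evaluate the filter "col > 10" against a positional attribute number."""
    return tuple_values[attnum - 1] > 10

child_tuple = (5, 42)  # b=5, a=42 in the child's physical column order

# Wrong: applying the parent's filter (attnum 1) to the unconverted child
# tuple reads b=5, so the row is incorrectly dropped.
assert eval_filter_by_attnum(child_tuple, 1) is False

# Right: convert the child tuple to the parent's column order first
# (the role played by execute_attr_map_tuple), then evaluate.
attr_map = [child_cols.index(c) for c in parent_cols]   # [1, 0]
converted = tuple(child_tuple[i] for i in attr_map)     # (42, 5)
assert eval_filter_by_attnum(converted, 1) is True      # reads a=42, row kept
```

This is only a sketch of the attribute-number mismatch; in the actual patch the conversion is done by the tuple conversion map (`relentry->map`) before the filter runs.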
Best regards,
Hou zj
PSA the v46* patch set.
Here are the main differences from v45:
0. Rebased to HEAD
1. Integrated many comments, docs, messages, code etc from Euler's
patch [Euler 6/12]
2. Several bugfixes
3. Patches are merged/added
~~
Bugfix and Patch Merge details:
v46-0001 (main)
- Merged from v45-0001 (main) + v45-0005 (exprstate)
- Fix for mem leak reported by Greg (off-list)
v46-0002 (validation)
- Merged from v45-0002 (node validation) + v45-0006 (replica identity
validation)
v46-0003
- Rebased from v45-0003
- Fix for partition column order [Houz 9/12]
- Fix for core dump reported by Tang (off-list)
v46-0004 (tab-complete and dump)
- Rebased from v45-0004
v46-0005 (for all tables)
- New patch
- Fix for FOR ALL TABLES [Tang 7/12]
------
[Euler 6/12] /messages/by-id/b676aef0-00c7-4c19-85f8-33786594e807@www.fastmail.com
[Tang 7/12] /messages/by-id/OS0PR01MB6113D82113AA081ACF710D0CFB6E9@OS0PR01MB6113.jpnprd01.prod.outlook.com
[Houz 9/12] /messages/by-id/OS0PR01MB5716EB3137D194030EB694F194709@OS0PR01MB5716.jpnprd01.prod.outlook.com
Kind Regards,
Peter Smith.
Fujitsu Australia
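The v46-0003 patch transforms each UPDATE according to whether the old and new rows match the row filter. A minimal sketch of that decision table (illustrative Python, not the patch's C code; the action names stand in for the `ReorderBufferChangeType` values):

```python
def transform_update(old_matches, new_matches):
    """Map (old row matches filter, new row matches filter) to the
    action the publisher should send for an UPDATE."""
    if not old_matches and not new_matches:
        return None        # neither row in the filtered set: drop the change
    if not old_matches and new_matches:
        return "INSERT"    # row enters the filtered set
    if old_matches and not new_matches:
        return "DELETE"    # row leaves the filtered set
    return "UPDATE"        # row stays in the filtered set

assert transform_update(False, False) is None
assert transform_update(False, True) == "INSERT"
assert transform_update(True, False) == "DELETE"
assert transform_update(True, True) == "UPDATE"
```

This mirrors the four-way table in the v46-0003 commit message; the subscriber then applies whichever action was actually sent.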
Attachments:
v46-0003-Row-filter-updates-based-on-old-new-tuples.patch (application/octet-stream)
From ac6bce5428c68a862ab8d14ea5dadd4da5ba02c3 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Mon, 13 Dec 2021 20:20:33 +1100
Subject: [PATCH v46] Row filter updates based on old/new tuples
When applying row filter on updates, check both old_tuple and new_tuple
to decide how an update needs to be transformed.
UPDATE
old-row (match) new-row (no match) -> DELETE
old-row (no match) new row (match) -> INSERT
old-row (match) new row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Tuples that have already been deformed are also cached in slots to avoid
deforming the same tuple multiple times.
Author: Ajin Cherian
---
src/backend/replication/logical/proto.c | 38 +++--
src/backend/replication/pgoutput/pgoutput.c | 228 ++++++++++++++++++++++++----
src/include/replication/logicalproto.h | 7 +-
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/027_row_filter.pl | 4 +-
src/tools/pgindent/typedefs.list | 1 +
6 files changed, 235 insertions(+), 49 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..110ccff 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -31,8 +32,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
- HeapTuple tuple, bool binary);
-
+ HeapTuple tuple, TupleTableSlot *slot,
+ bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -410,7 +411,7 @@ logicalrep_write_insert(StringInfo out, TransactionId xid, Relation rel,
pq_sendint32(out, RelationGetRelid(rel));
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, NULL, binary);
}
/*
@@ -442,7 +443,8 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
*/
void
logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
- HeapTuple oldtuple, HeapTuple newtuple, bool binary)
+ HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary)
{
pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
@@ -463,11 +465,11 @@ logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
pq_sendbyte(out, 'O'); /* old tuple follows */
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, oldslot, binary);
}
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, newslot, binary);
}
/*
@@ -536,7 +538,7 @@ logicalrep_write_delete(StringInfo out, TransactionId xid, Relation rel,
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, NULL, binary);
}
/*
@@ -749,13 +751,16 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
-logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binary)
+logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, TupleTableSlot *slot,
+bool binary)
{
TupleDesc desc;
- Datum values[MaxTupleAttributeNumber];
- bool isnull[MaxTupleAttributeNumber];
+ Datum *values;
+ bool *isnull;
int i;
uint16 nliveatts = 0;
+ Datum attr_values[MaxTupleAttributeNumber];
+ bool attr_isnull[MaxTupleAttributeNumber];
desc = RelationGetDescr(rel);
@@ -771,7 +776,17 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binar
enlargeStringInfo(out, tuple->t_len +
nliveatts * (1 + 4));
- heap_deform_tuple(tuple, desc, values, isnull);
+ if (TupIsNull(slot))
+ {
+ values = attr_values;
+ isnull = attr_isnull;
+ heap_deform_tuple(tuple, desc, values, isnull);
+ }
+ else
+ {
+ values = slot->tts_values;
+ isnull = slot->tts_isnull;
+ }
/* Write the values */
for (i = 0; i < desc->natts; i++)
@@ -832,6 +847,7 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binar
ReleaseSysCache(typtup);
}
+
}
/*
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 52ed2c6..c072b25 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,6 +13,7 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
#include "catalog/pg_publication_rel.h"
@@ -25,6 +26,7 @@
#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
#include "utils/builtins.h"
@@ -140,6 +142,9 @@ typedef struct RelationSyncEntry
ExprState *exprstate[IDX_PUBACTION_n]; /* ExprState array for row filter.
One per publication action. */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
+ TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
+ TupleTableSlot *tmp_new_tuple; /* slot for temporary new tuple used for expression evaluation */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -174,11 +179,15 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
- Relation relation, HeapTuple oldtuple,
- HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter(enum ReorderBufferChangeType changetype, Relation relation,
+ HeapTuple oldtuple, HeapTuple newtuple,
+ TupleTableSlot *slot, RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update_check(enum ReorderBufferChangeType changetype, Relation relation,
+ HeapTuple oldtuple, HeapTuple newtuple,
+ RelationSyncEntry *entry, ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -742,26 +751,124 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * Update is checked against the row filter, if any.
+ * Updates are transformed to inserts and deletes based on the
+ * old_tuple and new_tuple. The new action is updated in the
+ * action parameter. If not updated, action remains as update.
+ * old-row (no match) new-row (no match) -> (drop change)
+ * old-row (no match) new row (match) -> INSERT
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (match) new row (match) -> UPDATE
+ * If the change is to be replicated returns true, else false.
*/
static bool
-pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
- Relation relation, HeapTuple oldtuple, HeapTuple newtuple,
- RelationSyncEntry *entry)
+pgoutput_row_filter_update_check(enum ReorderBufferChangeType changetype, Relation relation,
+ HeapTuple oldtuple, HeapTuple newtuple,
+ RelationSyncEntry *entry, ReorderBufferChangeType *action)
{
- EState *estate;
- ExprContext *ecxt;
- ListCell *lc;
- bool result = true;
- Oid relid = RelationGetRelid(relation);
- List *rfnodes[] = {NIL, NIL, NIL}; /* One per pubaction */
- bool no_filter[] = {false, false, false}; /* One per pubaction */
+ TupleDesc desc = RelationGetDescr(relation);
+ int i;
+ bool old_matched, new_matched;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
Assert(changetype == REORDER_BUFFER_CHANGE_INSERT ||
changetype == REORDER_BUFFER_CHANGE_UPDATE ||
changetype == REORDER_BUFFER_CHANGE_DELETE);
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate[changetype])
+ return true;
+
+ /* update requires a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /* Clear the tuples */
+ ExecClearTuple(entry->old_tuple);
+ ExecClearTuple(entry->new_tuple);
+
+ /*
+ * If no old_tuple, then none of the replica identity columns changed
+ * and this would reduce to a simple update.
+ */
+ if (!oldtuple)
+ {
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ return pgoutput_row_filter(changetype, relation, NULL, newtuple, NULL, entry);
+ }
+
+ old_slot = entry->old_tuple;
+ new_slot = entry->new_tuple;
+ tmp_new_slot = entry->tmp_new_tuple;
+
+ ExecClearTuple(tmp_new_slot);
+
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, new_slot);
+
+ /*
+ * For updates, both the newtuple and oldtuple need to be checked
+ * against the row-filter. The newtuple might not have all the
+ * replica identity columns, in which case it needs to be
+ * copied over from the oldtuple.
+ */
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ /* if the column in the new_tuple is null, nothing to do */
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ /*
+ * Unchanged toasted replica identity columns are
+ * only detoasted in the old tuple, copy this over to the newtuple.
+ */
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ Assert(!old_slot->tts_isnull[i] &&
+ !VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]));
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ }
+
+ }
+
+ old_matched = pgoutput_row_filter(changetype, relation, NULL, NULL, old_slot, entry);
+ new_matched = pgoutput_row_filter(changetype, relation, NULL, NULL, tmp_new_slot, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (!old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && old_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter, the first time.
+ */
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
+{
+ ListCell *lc;
+ List *rfnodes[] = {NIL, NIL, NIL}; /* One per pubaction */
+ bool no_filter[] = {false, false, false}; /* One per pubaction */
+
/*
* If the row filter caching is currently flagged "invalid" then it means we
* don't know yet if there is/isn't any row filters for this relation.
@@ -931,12 +1038,35 @@ pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
MemoryContextSwitchTo(oldctx);
}
}
entry->exprstate_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(enum ReorderBufferChangeType changetype, Relation relation,
+ HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *slot,
+ RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+
+ Assert(changetype == REORDER_BUFFER_CHANGE_INSERT ||
+ changetype == REORDER_BUFFER_CHANGE_UPDATE ||
+ changetype == REORDER_BUFFER_CHANGE_DELETE);
/* Bail out if there is no row filter */
if (!entry->exprstate[changetype])
@@ -955,7 +1085,12 @@ pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
ecxt = GetPerTupleExprContext(estate);
ecxt->ecxt_scantuple = entry->scantuple;
- ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+ if (newtuple || oldtuple)
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+ else
+ {
+ ecxt->ecxt_scantuple = slot;
+ }
/*
* NOTE: Multiple publication row-filters have already been combined to a
@@ -968,7 +1103,6 @@ pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
}
/* Cleanup allocated resources */
- ResetExprContext(ecxt);
FreeExecutorState(estate);
PopActiveSnapshot();
@@ -1026,6 +1160,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -1033,10 +1170,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
- /* Check row filter. */
- if (!pgoutput_row_filter(change->action, data, relation, NULL, tuple, relentry))
- break;
-
/*
* Schema should be sent before the logic that replaces the
* relation because it also sends the ancestor's relation.
@@ -1054,6 +1187,11 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
tuple = execute_attr_map_tuple(tuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(change->action, relation, NULL, tuple,
+ NULL, relentry))
+ break;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_insert(ctx->out, xid, relation, tuple,
data->binary);
@@ -1065,10 +1203,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
-
- /* Check row filter. */
- if (!pgoutput_row_filter(change->action, data, relation, oldtuple, newtuple, relentry))
- break;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1089,9 +1224,34 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
}
+ /* Check row filter */
+ if (!pgoutput_row_filter_update_check(change->action, relation,
+ oldtuple, newtuple, relentry,
+ &modified_action))
+ break;
+
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ logicalrep_write_update(ctx->out, xid, relation,
+ oldtuple, newtuple, relentry->old_tuple,
+ relentry->new_tuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1100,10 +1260,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
- /* Check row filter. */
- if (!pgoutput_row_filter(change->action, data, relation, oldtuple, NULL, relentry))
- break;
-
maybe_send_schema(ctx, change, relation, relentry);
/* Switch relation if publishing via root. */
@@ -1117,6 +1273,11 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(change->action, relation, oldtuple,
+ NULL, NULL, relentry))
+ break;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
data->binary);
@@ -1519,6 +1680,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
+ entry->new_tuple = NULL;
+ entry->old_tuple = NULL;
+ entry->tmp_new_tuple = NULL;
entry->exprstate[REORDER_BUFFER_CHANGE_INSERT] = NULL;
entry->exprstate[REORDER_BUFFER_CHANGE_UPDATE] = NULL;
entry->exprstate[REORDER_BUFFER_CHANGE_DELETE] = NULL;
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..427c40a 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -211,7 +212,11 @@ extern void logicalrep_write_insert(StringInfo out, TransactionId xid,
extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup);
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
- HeapTuple newtuple, bool binary);
+ HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index de6b73d..a2f25f6 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -277,7 +277,8 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+# but the row is deleted by the UPDATE below.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -289,7 +290,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 575969c..e8dc5ad 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2198,6 +2198,7 @@ ReorderBufferApplyChangeCB
ReorderBufferApplyTruncateCB
ReorderBufferBeginCB
ReorderBufferChange
+ReorderBufferChangeType
ReorderBufferCommitCB
ReorderBufferCommitPreparedCB
ReorderBufferDiskChange
--
1.8.3.1
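The revised TAP expectation above (row 1600 no longer present on the subscriber) can be traced with a short SQL sketch. The table and filter are reconstructed from the test's own comments, so treat the exact DDL as illustrative rather than a copy of 027_row_filter.pl:

```sql
-- Publication as described by the TAP test comments: only rows with
-- a > 1000 AND b <> 'filtered' are replicated.
CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1
    WHERE (a > 1000 AND b <> 'filtered');

-- (1600, 'test 1600') satisfies the filter, so it is replicated.
INSERT INTO tab_rowfilter_1 VALUES (1600, 'test 1600');

-- This UPDATE makes the new row fail the filter (NULL is not <> 'filtered'),
-- so with this patch the publisher sends a DELETE for key 1600 instead of an
-- UPDATE, and the row disappears from the subscriber -- hence the removed
-- "1600|test 1600" line in the expected output.
UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600;
```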
Attachment: v46-0005-Row-filter-handle-FOR-ALL-TABLES.patch (application/octet-stream)
From 59f959ff6a9f81e1c152ca7f2351d2f6e4e7c886 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Mon, 13 Dec 2021 20:39:33 +1100
Subject: [PATCH v46] Row filter handle FOR ALL TABLES
If one of the subscriber's publications was created using FOR ALL TABLES then
that implies NO row-filtering will be applied.
This overrides any row filters from other subscribed publications.
Note that the initial COPY does not take publication operations into account.
Author: Peter Smith
Reported By: Tang
Discussion: https://www.postgresql.org/message-id/OS0PR01MB6113D82113AA081ACF710D0CFB6E9%40OS0PR01MB6113.jpnprd01.prod.outlook.com
TODO
- PG docs
- TAP test cases
- Similar case for FOR ALL TABLES IN SCHEMA?
---
src/backend/replication/logical/tablesync.c | 64 ++++++++++++++++++++++++++---
src/backend/replication/pgoutput/pgoutput.c | 18 ++++++++
2 files changed, 76 insertions(+), 6 deletions(-)
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 971e037..1f80261 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -700,6 +700,7 @@ fetch_remote_table_info(char *nspname, char *relname,
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid alltablesRow[] = {BOOLOID};
Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
@@ -802,12 +803,67 @@ fetch_remote_table_info(char *nspname, char *relname,
walrcv_clear_result(res);
/*
+ * If any publication has puballtables true then all row filtering is
+ * ignored.
+ *
* Get relation qual. DISTINCT avoids the same expression of a table in
* multiple publications from being included multiple times in the final
* expression.
*/
if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
{
+ bool puballtables = false;
+
+ /*
+ * Check for puballtables flag
+ */
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT p.puballtables "
+ " FROM pg_publication p "
+ " WHERE p.puballtables IS TRUE "
+ " AND p.pubname IN (");
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, alltablesRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch puballtables flag for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ puballtables = tuplestore_gettupleslot(res->tuplestore, true, false, slot);
+ ExecDropSingleTupleTableSlot(slot);
+ walrcv_clear_result(res);
+
+ if (puballtables)
+ {
+ if (*qual)
+ {
+ list_free_deep(*qual);
+ *qual = NIL;
+ }
+ pfree(cmd.data);
+ return;
+ }
+
+ /*
+ * Check for row-filters
+ */
resetStringInfo(&cmd);
appendStringInfo(&cmd,
"SELECT DISTINCT pg_get_expr(prqual, prrelid) "
@@ -847,18 +903,14 @@ fetch_remote_table_info(char *nspname, char *relname,
slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
{
- Datum rf = slot_getattr(slot, 1, &isnull);
+ Datum rf = slot_getattr(slot, 1, &isnull);
if (!isnull)
*qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
ExecClearTuple(slot);
- /*
- * One entry without a row filter expression means clean up
- * previous expressions (if there are any) and return with no
- * expressions.
- */
+ /* Ignore filters and cleanup as necessary. */
if (isnull)
{
if (*qual)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index c072b25..1a13d70 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -917,6 +917,8 @@ pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntr
* relation. Since row filter usage depends on the DML operation,
* there are multiple lists (one for each operation) which row filters
* will be appended.
+ *
+ * NOTE: FOR ALL TABLES implies "use no filters" so it takes precedence
*/
foreach(lc, data->publications)
{
@@ -926,6 +928,22 @@ pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntr
bool rfisnull;
/*
+ * If the publication is FOR ALL TABLES then it is treated the same as if this
+ * table has no filters (even if for some other publication it does).
+ */
+ if (pub->alltables)
+ {
+ if (pub->pubactions.pubinsert)
+ no_filter[idx_ins] = true;
+ if (pub->pubactions.pubupdate)
+ no_filter[idx_upd] = true;
+ if (pub->pubactions.pubdelete)
+ no_filter[idx_del] = true;
+
+ continue;
+ }
+
+ /*
+ * Look up whether there is a row filter; if so, remember it in a list (per
+ * pubaction). If not, remember that there was no filter for this pubaction.
* Code following this 'publications' loop will combine all filters.
--
1.8.3.1
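A minimal scenario for the precedence rule this patch implements (publication, table, and connection details are illustrative, not taken from the patch):

```sql
-- On the publisher:
CREATE TABLE t1 (id int PRIMARY KEY);
CREATE PUBLICATION pub_filtered FOR TABLE t1 WHERE (id > 100);
CREATE PUBLICATION pub_all FOR ALL TABLES;

-- On the subscriber: because pub_all was created FOR ALL TABLES, it takes
-- precedence and NO row filter is applied to t1, even though pub_filtered
-- defines one -- every row of t1 is replicated.
CREATE SUBSCRIPTION sub1 CONNECTION '...'
    PUBLICATION pub_filtered, pub_all;
```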
Attachment: v46-0004-Row-filter-tab-auto-complete-and-pgdump.patch (application/octet-stream)
From 0bbb437d472563ae9115a1b011dae56e365ade31 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Mon, 13 Dec 2021 20:26:19 +1100
Subject: [PATCH v46] Row filter tab auto-complete and pgdump
tab-auto-complete
-----------------
e.g.
"CREATE PUBLICATION <name> FOR TABLE <name>" - complete with "WHERE (".
"ALTER PUBLICATION <name> ADD|SET TABLE <name>" - complete with "WHERE (".
Author: Peter Smith
pg_dump
-------
Author: Euler Taveira
---
src/bin/pg_dump/pg_dump.c | 24 ++++++++++++++++++++----
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/tab-complete.c | 24 ++++++++++++++++++++++--
3 files changed, 43 insertions(+), 6 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 10a86f9..e595c7f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4265,6 +4265,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4275,9 +4276,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4286,6 +4294,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4326,6 +4335,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationObject(&(pubrinfo[j].dobj), fout);
@@ -4393,8 +4406,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 6dccb4b..74f82cd 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -633,6 +633,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 2f412ca..8b2d0fd 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,19 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (");
+ /*
+ * "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with
+ * table attributes
+ *
+ * "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with
+ * table attributes
+ */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2777,13 +2790,20 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("TABLES", "TABLES IN SCHEMA");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
- else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
/*
+ * "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with
+ * table attributes
+ */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
+
+ /*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
* ..."
*/
--
1.8.3.1
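With the pg_dump change above, a publication membership that carries a row filter would be emitted roughly as follows (object names are illustrative; the shape follows the `" WHERE (%s)"` append in dumpPublicationTable):

```sql
-- Dumped by pg_dump against a >= 15 server when
-- pg_publication_rel.prqual is non-NULL for this membership:
ALTER PUBLICATION pub_filtered ADD TABLE ONLY public.t1 WHERE (id > 100);
```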
Attachment: v46-0002-Row-filter-validation.patch (application/octet-stream)
From 020e95cdf34e2c9b415b5649a49724b16f21f15a Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Mon, 13 Dec 2021 20:04:42 +1100
Subject: [PATCH v46] Row filter validation
This patch implements parse-tree "walkers" to validate a row-filter.
Expression Node-kind validation
-------------------------------
Only simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no user-defined functions.
- no user-defined types.
- no system functions (unless they are immutable). See design decision at [1].
[1] https://www.postgresql.org/message-id/CAA4eK1%2BXoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ%40mail.gmail.com
Permits only simple nodes including:
List, Const, BoolExpr, NullIfExpr, NullTest, BooleanTest, CoalesceExpr,
CaseExpr, CaseTestExpr, MinMaxExpr, ArrayExpr, ScalarArrayOpExpr, XmlExpr.
Author: Peter Smith, Euler Taveira
REPLICA IDENTITY validation
---------------------------
For publish modes "delete" and "update", validate that any columns referenced
in the filter expression are part of the REPLICA IDENTITY or primary key.
Row filter column validation is done in CheckCmdReplicaIdentity, so the check
runs only when an actual UPDATE or DELETE is executed on the published
relation. This is consistent with the existing replica identity check and
detects row-filter-related changes in time.
The validation results for row filter columns are cached in the relcache to
reduce the cost of the check. This is safe because every operation that
changes the row filter or the replica identity invalidates the relcache.
Author: Hou zj
---
src/backend/catalog/pg_publication.c | 109 ++++++++++++-
src/backend/executor/execReplication.c | 36 +++-
src/backend/parser/parse_agg.c | 8 +-
src/backend/parser/parse_expr.c | 19 +--
src/backend/parser/parse_func.c | 3 +-
src/backend/parser/parse_oper.c | 7 -
src/backend/utils/cache/relcache.c | 262 ++++++++++++++++++++++++++----
src/include/utils/rel.h | 7 +
src/include/utils/relcache.h | 1 +
src/test/regress/expected/publication.out | 223 ++++++++++++++++++++-----
src/test/regress/sql/publication.sql | 174 +++++++++++++++++---
src/test/subscription/t/027_row_filter.pl | 7 +-
src/tools/pgindent/typedefs.list | 1 +
13 files changed, 721 insertions(+), 136 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index bc5f6a2..2a4ee71 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,9 +33,11 @@
#include "catalog/pg_publication_namespace.h"
#include "catalog/pg_publication_rel.h"
#include "catalog/pg_type.h"
+#include "catalog/pg_proc.h"
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parse_clause.h"
#include "parser/parse_collate.h"
#include "parser/parse_relation.h"
@@ -112,6 +114,102 @@ check_publication_add_schema(Oid schemaid)
}
/*
+ * The row filter walker checks if the row filter expression is a "simple
+ * expression".
+ *
+ * It allows only simple or compound expressions such as:
+ * - "(Var Op Const)" or
+ * - "(Var Op Var)" or
+ * - "(Var Op Const) Bool (Var Op Const)"
+ * - etc
+ * (where Var is a column of the table this filter belongs to)
+ *
+ * Specifically,
+ * - User-defined operators are not allowed.
+ * - User-defined functions are not allowed.
+ * - User-defined types are not allowed.
+ * - Non-immutable builtin functions are not allowed.
+ *
+ * Notes:
+ *
+ * We don't allow user-defined functions/operators/types because (a) if the user
+ * drops such a user-definition or if there is any other error via its function,
+ * the walsender won't be able to recover from such an error even if we fix the
+ * function's problem because a historic snapshot is used to access the
+ * row-filter; (b) any other table could be accessed via a function, which won't
+ * work because of historic snapshots in logical decoding environment.
+ *
+ * We don't allow anything other than immutable built-in functions because
+ * non-immutable functions can access the database and would lead to the problem
+ * (b) mentioned in the previous paragraph.
+ */
+static bool
+rowfilter_walker(Node *node, Relation relation)
+{
+ char *errdetail_msg = NULL;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+
+ /* User-defined types not allowed. */
+ if (var->vartype >= FirstNormalObjectId)
+ errdetail_msg = _("User-defined types are not allowed.");
+ }
+ else if (IsA(node, List) || IsA(node, Const) || IsA(node, BoolExpr) || IsA(node, NullIfExpr) ||
+ IsA(node, NullTest) || IsA(node, BooleanTest) || IsA(node, CoalesceExpr) ||
+ IsA(node, CaseExpr) || IsA(node, CaseTestExpr) || IsA(node, MinMaxExpr) ||
+ IsA(node, ArrayExpr) || IsA(node, ScalarArrayOpExpr) || IsA(node, XmlExpr))
+ {
+ /* OK, these nodes are part of simple expressions */
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *) node)->opno >= FirstNormalObjectId)
+ errdetail_msg = _("User-defined operators are not allowed.");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ Oid funcid = ((FuncExpr *) node)->funcid;
+ const char *funcname = get_func_name(funcid);
+
+ /*
+ * User-defined functions are not allowed.
+ * System-functions that are not IMMUTABLE are not allowed.
+ */
+ if (funcid >= FirstNormalObjectId)
+ errdetail_msg = psprintf(_("User-defined functions are not allowed (%s)."),
+ funcname);
+ else if (func_volatile(funcid) != PROVOLATILE_IMMUTABLE)
+ errdetail_msg = psprintf(_("Non-immutable built-in functions are not allowed (%s)."),
+ funcname);
+ }
+ else
+ {
+ elog(DEBUG1, "the row filter contained something unexpected: %s", nodeToString(node));
+
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(relation)),
+ errdetail("Expressions only allow columns, constants and some built-in functions and operators.")
+ ));
+ }
+
+ if (errdetail_msg)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(relation)),
+ errdetail("%s", errdetail_msg)
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)relation);
+}
+
+/*
* Returns if relation represented by oid and Form_pg_class entry
* is publishable.
*
@@ -241,10 +339,6 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
-/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
- */
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -355,6 +449,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
/* Fix up collation information */
whereclause = GetTransformedWhereClause(pstate, pri, true);
+
+ /*
+ * Walk the parse-tree of this publication row filter expression and
+ * throw an error if anything not permitted or unexpected is
+ * encountered.
+ */
+ rowfilter_walker(whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 574d7d2..2cbe2aa 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -568,14 +568,46 @@ void
CheckCmdReplicaIdentity(Relation rel, CmdType cmd)
{
PublicationActions *pubactions;
+ AttrNumber invalid_rfcolnum;
/* We only need to do checks for UPDATE and DELETE. */
if (cmd != CMD_UPDATE && cmd != CMD_DELETE)
return;
+ if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ return;
+
+ invalid_rfcolnum = RelationGetInvalidRowFilterCol(rel);
+
+ /*
+ * It is only safe to execute UPDATE/DELETE when all columns referenced in
+ * the row filters from publications which the relation is in are valid,
+ * which means all referenced columns are part of REPLICA IDENTITY, or the
+ * table do not publish UPDATES or DELETES.
+ */
+ if (invalid_rfcolnum)
+ {
+ const char *colname = get_attname(RelationGetRelid(rel),
+ invalid_rfcolnum, false);
+
+ if (cmd == CMD_UPDATE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot update table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Column \"%s\" used in the publication WHERE expression is not part of the replica identity.",
+ colname)));
+ else if (cmd == CMD_DELETE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot delete from table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Column \"%s\" used in the publication WHERE expression is not part of the replica identity.",
+ colname)));
+ }
+
/* If relation has replica identity we are always good. */
- if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
- OidIsValid(RelationGetReplicaIndex(rel)))
+ if (OidIsValid(RelationGetReplicaIndex(rel)))
return;
/*
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..0d39cfe 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -552,11 +552,7 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
case EXPR_KIND_PUBLICATION_WHERE:
- if (isAgg)
- err = _("aggregate functions are not allowed in publication WHERE expressions");
- else
- err = _("grouping operations are not allowed in publication WHERE expressions");
-
+ /* okay (see function rowfilter_walker in pg_publication.c). */
break;
case EXPR_KIND_CYCLE_MARK:
@@ -951,7 +947,7 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
- err = _("window functions are not allowed in publication WHERE expressions");
+ /* okay (see function rowfilter_walker in pg_publication.c). */
break;
/*
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..7933387 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,19 +200,8 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- {
- /*
- * Forbid functions in publication WHERE condition
- */
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("functions are not allowed in publication WHERE expressions"),
- parser_errposition(pstate, exprLocation(expr))));
-
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
- }
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -1777,7 +1766,7 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
err = _("cannot use subquery in column generation expression");
break;
case EXPR_KIND_PUBLICATION_WHERE:
- err = _("cannot use subquery in publication WHERE expression");
+ /* okay (see function rowfilter_walker in pg_publication.c). */
break;
/*
@@ -3100,7 +3089,7 @@ ParseExprKindName(ParseExprKind exprKind)
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
+ return "publication WHERE expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 29bebb7..2b2486a 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,7 +2656,8 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
errkind = true;
break;
case EXPR_KIND_PUBLICATION_WHERE:
- err = _("set-returning functions are not allowed in publication WHERE expressions");
+ /* okay (see function rowfilter_walker in pg_publication.c). */
+ pstate->p_hasTargetSRFs = true;
break;
/*
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..bc34a23 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,13 +718,6 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
- /* Check it's not a custom operator for publication WHERE expressions */
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("user-defined operators are not allowed in publication WHERE expressions"),
- parser_errposition(pstate, location)));
-
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 105d8d4..ed04881 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -56,6 +56,7 @@
#include "catalog/pg_opclass.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "catalog/pg_rewrite.h"
#include "catalog/pg_shseclabel.h"
#include "catalog/pg_statistic_ext.h"
@@ -71,6 +72,8 @@
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/optimizer.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_relation.h"
#include "rewrite/rewriteDefine.h"
#include "rewrite/rowsecurity.h"
#include "storage/lmgr.h"
@@ -84,6 +87,7 @@
#include "utils/memutils.h"
#include "utils/relmapper.h"
#include "utils/resowner_private.h"
+#include "utils/ruleutils.h"
#include "utils/snapmgr.h"
#include "utils/syscache.h"
@@ -5521,57 +5525,169 @@ RelationGetExclusionInfo(Relation indexRelation,
MemoryContextSwitchTo(oldcxt);
}
+/* For invalid_rowfilter_column_walker. */
+typedef struct {
+ AttrNumber invalid_rfcolnum; /* invalid column number */
+ Bitmapset *bms_replident; /* bitset of replica identity col indexes */
+ bool pubviaroot; /* true if we are validating the parent
+ * relation's row filter */
+ Oid relid; /* relid of the relation */
+ Oid parentid; /* relid of the parent relation */
+} rf_context;
+
/*
- * Get publication actions for the given relation.
+ * Check if any columns used in the row-filter WHERE clause are not part of
+ * REPLICA IDENTITY and save the invalid column number in
+ * rf_context::invalid_rfcolnum.
*/
-struct PublicationActions *
-GetRelationPublicationActions(Relation relation)
+static bool
+invalid_rowfilter_column_walker(Node *node, rf_context *context)
{
- List *puboids;
- ListCell *lc;
- MemoryContext oldcxt;
- Oid schemaid;
- PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+ AttrNumber attnum = var->varattno;
+
+ /*
+ * If pubviaroot is true, we need to convert the column number of
+ * parent to the column number of child relation first.
+ */
+ if (context->pubviaroot)
+ {
+ char *colname = get_attname(context->parentid, attnum, false);
+ attnum = get_attnum(context->relid, colname);
+ }
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber,
+ context->bms_replident))
+ {
+ context->invalid_rfcolnum = attnum;
+ return true;
+ }
+ }
+
+ return expression_tree_walker(node, invalid_rowfilter_column_walker,
+ (void *) context);
+}
+
+/*
+ * Append to cur_puboids each member of add_puboids that isn't already in
+ * cur_puboids.
+ *
+ * Also update the top most parent relation's relid in the publication.
+ */
+static void
+concat_publication_oid(Oid relid,
+ List **cur_puboids,
+ List **toprelid_in_pub,
+ const List *add_puboids)
+{
+ ListCell *lc1,
+ *lc2,
+ *lc3;
+
+ foreach(lc1, add_puboids)
+ {
+ bool is_member = false;
+
+ forboth(lc2, *cur_puboids, lc3, *toprelid_in_pub)
+ {
+ if (lfirst_oid(lc2) == lfirst_oid(lc1))
+ {
+ is_member = true;
+ lfirst_oid(lc3) = relid;
+ }
+ }
+
+ if (!is_member)
+ {
+ *cur_puboids = lappend_oid(*cur_puboids, lfirst_oid(lc1));
+ *toprelid_in_pub = lappend_oid(*toprelid_in_pub, relid);
+ }
+ }
+}
+
+/*
+ * Get the invalid row filter column number for the given relation.
+ *
+ * Traverse all the publications which the relation is in to get the
+ * publication actions. If the publication actions include UPDATE or DELETE,
+ * then validate that if all columns referenced in the row filter expression
+ * are part of REPLICA IDENTITY.
+ *
+ * If not all the row filter columns are part of REPLICA IDENTITY, return the
+ * invalid column number, otherwise InvalidAttrNumber.
+ */
+AttrNumber
+RelationGetInvalidRowFilterCol(Relation relation)
+{
+ List *puboids,
+ *toprelid_in_pub;
+ ListCell *lc;
+ MemoryContext oldcxt;
+ Oid schemaid;
+ Oid relid = RelationGetRelid(relation);
+ rf_context context = { 0 };
+ PublicationActions pubactions = { 0 };
+ bool rfcol_valid = true;
+ AttrNumber invalid_rfcolnum = InvalidAttrNumber;
/*
* If not publishable, it publishes no actions. (pgoutput_change() will
* ignore it.)
*/
- if (!is_publishable_relation(relation))
- return pubactions;
-
- if (relation->rd_pubactions)
- return memcpy(pubactions, relation->rd_pubactions,
- sizeof(PublicationActions));
+ if (!is_publishable_relation(relation) || relation->rd_rfcol_valid)
+ return invalid_rfcolnum;
/* Fetch the publication membership info. */
- puboids = GetRelationPublications(RelationGetRelid(relation));
+ toprelid_in_pub = puboids = NIL;
+ concat_publication_oid(relid, &puboids, &toprelid_in_pub,
+ GetRelationPublications(relid));
schemaid = RelationGetNamespace(relation);
- puboids = list_concat_unique_oid(puboids, GetSchemaPublications(schemaid));
+ concat_publication_oid(relid, &puboids, &toprelid_in_pub,
+ GetSchemaPublications(schemaid));
if (relation->rd_rel->relispartition)
{
/* Add publications that the ancestors are in too. */
- List *ancestors = get_partition_ancestors(RelationGetRelid(relation));
+ List *ancestors = get_partition_ancestors(relid);
ListCell *lc;
foreach(lc, ancestors)
{
Oid ancestor = lfirst_oid(lc);
- puboids = list_concat_unique_oid(puboids,
- GetRelationPublications(ancestor));
+ concat_publication_oid(ancestor, &puboids, &toprelid_in_pub,
+ GetRelationPublications(ancestor));
schemaid = get_rel_namespace(ancestor);
- puboids = list_concat_unique_oid(puboids,
- GetSchemaPublications(schemaid));
+ concat_publication_oid(ancestor, &puboids, &toprelid_in_pub,
+ GetSchemaPublications(schemaid));
}
+
+ relid = llast_oid(ancestors);
}
- puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
+ concat_publication_oid(relid, &puboids, &toprelid_in_pub,
+ GetAllTablesPublications());
+
+ /*
+ * Find the columns that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means the primary key or nothing.
+ */
+ if (relation->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT)
+ context.bms_replident = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else if (relation->rd_rel->relreplident == REPLICA_IDENTITY_INDEX)
+ context.bms_replident = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_IDENTITY_KEY);
foreach(lc, puboids)
{
Oid pubid = lfirst_oid(lc);
HeapTuple tup;
+
Form_pg_publication pubform;
tup = SearchSysCache1(PUBLICATIONOID, ObjectIdGetDatum(pubid));
@@ -5581,35 +5697,116 @@ GetRelationPublicationActions(Relation relation)
pubform = (Form_pg_publication) GETSTRUCT(tup);
- pubactions->pubinsert |= pubform->pubinsert;
- pubactions->pubupdate |= pubform->pubupdate;
- pubactions->pubdelete |= pubform->pubdelete;
- pubactions->pubtruncate |= pubform->pubtruncate;
+ pubactions.pubinsert |= pubform->pubinsert;
+ pubactions.pubupdate |= pubform->pubupdate;
+ pubactions.pubdelete |= pubform->pubdelete;
+ pubactions.pubtruncate |= pubform->pubtruncate;
ReleaseSysCache(tup);
/*
- * If we know everything is replicated, there is no point to check for
- * other publications.
+ * If the publication actions include UPDATE or DELETE, validate
+ * that any columns referenced in the filter expression are part of
+ * the REPLICA IDENTITY index.
+ *
+ * FULL means all columns are in the REPLICA IDENTITY, so all columns
+ * are allowed in the row filter and we can skip the validation.
+ *
+ * If we already found a row filter column that is not part of the
+ * REPLICA IDENTITY index, skip the validation too.
*/
- if (pubactions->pubinsert && pubactions->pubupdate &&
- pubactions->pubdelete && pubactions->pubtruncate)
+ if ((pubform->pubupdate || pubform->pubdelete) &&
+ relation->rd_rel->relreplident != REPLICA_IDENTITY_FULL &&
+ rfcol_valid)
+ {
+ HeapTuple rftuple;
+
+ if (pubform->pubviaroot)
+ relid = list_nth_oid(toprelid_in_pub,
+ foreach_current_index(lc));
+ else
+ relid = RelationGetRelid(relation);
+
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP,
+ ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pubid));
+
+ if (HeapTupleIsValid(rftuple))
+ {
+ Datum rfdatum;
+ bool rfisnull;
+ Node *rfnode;
+
+ context.pubviaroot = pubform->pubviaroot;
+ context.parentid = relid;
+ context.relid = RelationGetRelid(relation);
+
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
+ Anum_pg_publication_rel_prqual,
+ &rfisnull);
+
+ if (!rfisnull)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfcol_valid = !invalid_rowfilter_column_walker(rfnode,
+ &context);
+ invalid_rfcolnum = context.invalid_rfcolnum;
+ pfree(rfnode);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+ }
+
+ /*
+ * If we know everything is replicated and some row filter columns are
+ * not part of the replica identity, there is no point in checking the
+ * other publications.
+ */
+ if (pubactions.pubinsert && pubactions.pubupdate &&
+ pubactions.pubdelete && pubactions.pubtruncate &&
+ !rfcol_valid)
break;
}
+ bms_free(context.bms_replident);
+
if (relation->rd_pubactions)
{
pfree(relation->rd_pubactions);
relation->rd_pubactions = NULL;
}
+ relation->rd_rfcol_valid = rfcol_valid;
+
/* Now save copy of the actions in the relcache entry. */
oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
relation->rd_pubactions = palloc(sizeof(PublicationActions));
- memcpy(relation->rd_pubactions, pubactions, sizeof(PublicationActions));
+ memcpy(relation->rd_pubactions, &pubactions, sizeof(PublicationActions));
MemoryContextSwitchTo(oldcxt);
- return pubactions;
+ return invalid_rfcolnum;
+}
+
+/*
+ * Get publication actions for the given relation.
+ */
+struct PublicationActions *
+GetRelationPublicationActions(Relation relation)
+{
+ PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
+
+ /*
+ * If not publishable, it publishes no actions. (pgoutput_change() will
+ * ignore it.)
+ */
+ if (!is_publishable_relation(relation))
+ return pubactions;
+
+ if (!relation->rd_pubactions)
+ (void) RelationGetInvalidRowFilterCol(relation);
+
+ return memcpy(pubactions, relation->rd_pubactions,
+ sizeof(PublicationActions));
}
/*
@@ -6163,6 +6360,7 @@ load_relcache_init_file(bool shared)
rel->rd_idattr = NULL;
rel->rd_hotblockingattr = NULL;
rel->rd_pubactions = NULL;
+ rel->rd_rfcol_valid = false;
rel->rd_statvalid = false;
rel->rd_statlist = NIL;
rel->rd_fkeyvalid = false;
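For readers skimming the patch, the core of RelationGetInvalidRowFilterCol reduces to a set-containment check: every column referenced by the row filter must be contained in the replica identity column set. Here is a minimal standalone sketch of that check (bitmask column sets stand in for the Bitmapset and expression-tree-walker machinery the actual patch uses; names are illustrative, not PostgreSQL's):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the row filter validation for UPDATE/DELETE publications.
 * Column sets are modeled as bitmasks where bit (n-1) represents
 * attribute number n. Returns the lowest offending attribute number,
 * or 0 (the InvalidAttrNumber analogue) when the filter is valid.
 */
static int
invalid_rowfilter_column(uint32_t filter_cols, uint32_t replident_cols,
                         bool replident_full)
{
    /* REPLICA IDENTITY FULL: every column qualifies, nothing to check */
    if (replident_full)
        return 0;

    /* Columns used by the filter but missing from the replica identity */
    uint32_t missing = filter_cols & ~replident_cols;

    if (missing == 0)
        return 0;

    /* Report the lowest offending attribute number (1-based) */
    int attnum = 1;
    while ((missing & 1) == 0)
    {
        missing >>= 1;
        attnum++;
    }
    return attnum;
}
```

This mirrors the regression tests below: with a primary key on (a, b) and REPLICA IDENTITY DEFAULT, a filter on column c is rejected, while REPLICA IDENTITY FULL accepts any filter column.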
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 3128127..27cec81 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -164,6 +164,13 @@ typedef struct RelationData
PublicationActions *rd_pubactions; /* publication actions */
/*
+ * True if the columns referenced in the row filters of all publications
+ * that the relation is in are part of the replica identity, or if the
+ * publication actions do not include UPDATE or DELETE.
+ */
+ bool rd_rfcol_valid;
+
+ /*
* rd_options is set whenever rd_rel is loaded into the relcache entry.
* Note that you can NOT look into rd_rel for this data. NULL means "use
* defaults".
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 82316bb..25c759f 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -76,6 +76,7 @@ extern void RelationInitIndexAccessInfo(Relation relation);
/* caller must include pg_publication.h */
struct PublicationActions;
extern struct PublicationActions *GetRelationPublicationActions(Relation relation);
+extern AttrNumber RelationGetInvalidRowFilterCol(Relation relation);
extern void RelationInitTableAccessMethod(Relation relation);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index f93a63d..69ed2c8 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -243,18 +243,21 @@ CREATE TABLE testpub_rf_tbl1 (a integer, b text);
CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
-CREATE SCHEMA testpub_rf_myschema;
-CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
-CREATE SCHEMA testpub_rf_myschema1;
-CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+CREATE TABLE testpub_rf_tbl5 (a xml);
+CREATE SCHEMA testpub_rf_schema1;
+CREATE TABLE testpub_rf_schema1.testpub_rf_tbl5 (h integer);
+CREATE SCHEMA testpub_rf_schema2;
+CREATE TABLE testpub_rf_schema2.testpub_rf_tb16 (i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish='insert' because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
@@ -264,7 +267,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
@@ -275,7 +278,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE ((e > 1000) AND (e < 2000))
@@ -286,7 +289,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
@@ -308,83 +311,221 @@ Publications:
DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax1
Publication testpub_syntax1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (e < 999)
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_schema1.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax2
Publication testpub_syntax2
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
- "testpub_rf_myschema.testpub_rf_tbl5" WHERE (h < 999)
+ "testpub_rf_schema1.testpub_rf_tbl5" WHERE (h < 999)
DROP PUBLICATION testpub_syntax2;
--- fail - schemas are not allowed WHERE row-filter
+-- fail - schemas don't allow WHERE clause
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1 WHERE (a = 123);
ERROR: syntax error at or near "WHERE"
-LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a =...
+LINE 1: ...ntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1 WHERE (a =...
^
-CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1, testpub_rf_schema1 WHERE (a = 123);
ERROR: WHERE clause for schema not allowed
-LINE 1: ...ax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf...
+LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1, testpub_rf...
^
RESET client_min_messages;
--- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+-- fail - duplicate tables are not allowed if that table has any WHERE clause
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
--- fail - user-defined operators disallowed
-CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
-CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+-- fail - user-defined operators are not allowed
+CREATE FUNCTION testpub_rf_func1(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func1, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: User-defined operators are not allowed.
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func2() RETURNS integer AS $$ BEGIN RETURN 123; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a >= testpub_rf_func2());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: User-defined functions are not allowed (testpub_rf_func2).
+-- fail - non-immutable functions are not allowed. random() is volatile.
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a < random());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: Non-immutable built-in functions are not allowed (random).
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
+-- ok - builtin operators are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE ((a > 5) IS FALSE);
+-- ok - immutable builtin functions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined types disallowed
+CREATE TYPE rf_bug_status AS ENUM ('new', 'open', 'closed');
+CREATE TABLE rf_bug (id serial, description text, status rf_bug_status);
+CREATE PUBLICATION testpub6 FOR TABLE rf_bug WHERE (status = 'open') WITH (publish = 'insert');
+ERROR: invalid publication WHERE expression for relation "rf_bug"
+DETAIL: user-defined types are not allowed
+DROP TABLE rf_bug;
+DROP TYPE rf_bug_status;
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: Expressions only allow columns, constants and some built-in functions and operators.
+-- ok - conditional expressions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (a IS DOCUMENT);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (xmlexists('//foo[text() = ''bar'']' PASSING BY VALUE a));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1, 2) = a);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (CASE a WHEN 5 THEN true ELSE false END);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (COALESCE(b, 'foo') = 'foo');
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (GREATEST(a, 10) > 10);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IN (2, 4, 6));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (ARRAY[a] <@ ARRAY[2, 4, 6]);
-- fail - WHERE not allowed in DROP
-ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl1 WHERE (e < 27);
ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
-ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
-ERROR: cannot add relation "testpub_rf_myschema1.testpub_rf_tb16" to publication
-DETAIL: Table's schema "testpub_rf_myschema1" is already part of the publication or part of the specified schema list.
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_schema2;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_schema2, TABLE testpub_rf_schema2.testpub_rf_tb16;
+ERROR: cannot add relation "testpub_rf_schema2.testpub_rf_tb16" to publication
+DETAIL: Table's schema "testpub_rf_schema2" is already part of the publication or part of the specified schema list.
RESET client_min_messages;
DROP TABLE testpub_rf_tbl1;
DROP TABLE testpub_rf_tbl2;
DROP TABLE testpub_rf_tbl3;
DROP TABLE testpub_rf_tbl4;
-DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
-DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
-DROP SCHEMA testpub_rf_myschema;
-DROP SCHEMA testpub_rf_myschema1;
+DROP TABLE testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema1.testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema2.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_schema1;
+DROP SCHEMA testpub_rf_schema2;
DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
-DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func1(integer, integer);
+DROP FUNCTION testpub_rf_func2();
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+CREATE TABLE rf_tbl_abcd_part_pk (a int PRIMARY KEY, b int) PARTITION by RANGE (a);
+CREATE TABLE rf_tbl_abcd_part_pk_1 (b int, a int PRIMARY KEY);
+ALTER TABLE rf_tbl_abcd_part_pk ATTACH PARTITION rf_tbl_abcd_part_pk_1 FOR VALUES FROM (1) TO (10);
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+-- ok - "a" is a PK col
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (b > 99);
+-- ok - "b" is a PK col
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "c" is not part of the PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Column "c" used in the publication WHERE expression is not part of the replica identity.
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- fail - "d" is not part of the PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Column "d" used in the publication WHERE expression is not part of the replica identity.
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not part of REPLICA IDENTITY
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
+DETAIL: Column "a" used in the publication WHERE expression is not part of the replica identity.
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "a" is in REPLICA IDENTITY now
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Column "a" used in the publication WHERE expression is not part of the replica identity.
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Column "c" used in the publication WHERE expression is not part of the replica identity.
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
+DETAIL: Column "a" used in the publication WHERE expression is not part of the replica identity.
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Column "a" used in the publication WHERE expression is not part of the replica identity.
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
+DETAIL: Column "a" used in the publication WHERE expression is not part of the replica identity.
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+-- Tests for partitioned table
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (a > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- ok - "a" is a OK col
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (b > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- fail - "b" is not in REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_part_pk_1"
+DETAIL: Column "b" used in the publication WHERE expression is not part of the replica identity.
+DROP PUBLICATION testpub6;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+DROP TABLE rf_tbl_abcd_part_pk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
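The four REPLICA IDENTITY cases exercised by the tests above (DEFAULT, FULL, NOTHING, USING INDEX) differ only in which column set the validation compares the filter columns against. A hypothetical mapping, mirroring the DEFAULT/INDEX branches added in RelationGetInvalidRowFilterCol (enum and function names are illustrative, not PostgreSQL's):

```c
#include <assert.h>
#include <stdint.h>

/* Which column set backs each REPLICA IDENTITY mode; column sets are
 * bitmasks where bit (n-1) represents attribute number n. */
enum replident_mode { RI_DEFAULT, RI_INDEX, RI_NOTHING, RI_FULL };

static uint32_t
replident_columns(enum replident_mode mode, uint32_t pk_cols,
                  uint32_t identity_index_cols, uint32_t all_cols)
{
    switch (mode)
    {
        case RI_DEFAULT:
            return pk_cols;             /* primary key, or empty if none */
        case RI_INDEX:
            return identity_index_cols; /* REPLICA IDENTITY USING INDEX */
        case RI_FULL:
            return all_cols;            /* every column qualifies */
        case RI_NOTHING:
            return 0;                   /* no column qualifies */
    }
    return 0;
}
```

So for rf_tbl_abcd_pk with PRIMARY KEY (a, b), switching from DEFAULT to NOTHING shrinks the permitted filter columns from {a, b} to the empty set, which is why the same WHERE (a > 99) filter flips from ok to fail in Case 3.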
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index c13ccc5..0985681 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -138,12 +138,15 @@ CREATE TABLE testpub_rf_tbl1 (a integer, b text);
CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
-CREATE SCHEMA testpub_rf_myschema;
-CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
-CREATE SCHEMA testpub_rf_myschema1;
-CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+CREATE TABLE testpub_rf_tbl5 (a xml);
+CREATE SCHEMA testpub_rf_schema1;
+CREATE TABLE testpub_rf_schema1.testpub_rf_tbl5 (h integer);
+CREATE SCHEMA testpub_rf_schema2;
+CREATE TABLE testpub_rf_schema2.testpub_rf_tb16 (i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish='insert' because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -162,51 +165,174 @@ RESET client_min_messages;
DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax1
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_schema1.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax2
DROP PUBLICATION testpub_syntax2;
--- fail - schemas are not allowed WHERE row-filter
+-- fail - schemas don't allow WHERE clause
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
-CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1 WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1, testpub_rf_schema1 WHERE (a = 123);
RESET client_min_messages;
--- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+-- fail - duplicate tables are not allowed if that table has any WHERE clause
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
RESET client_min_messages;
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
--- fail - user-defined operators disallowed
-CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
-CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+-- fail - user-defined operators are not allowed
+CREATE FUNCTION testpub_rf_func1(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func1, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func2() RETURNS integer AS $$ BEGIN RETURN 123; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a >= testpub_rf_func2());
+-- fail - non-immutable functions are not allowed. random() is volatile.
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a < random());
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
+-- ok - builtin operators are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE ((a > 5) IS FALSE);
+-- ok - immutable builtin functions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined types disallowed
+CREATE TYPE rf_bug_status AS ENUM ('new', 'open', 'closed');
+CREATE TABLE rf_bug (id serial, description text, status rf_bug_status);
+CREATE PUBLICATION testpub6 FOR TABLE rf_bug WHERE (status = 'open') WITH (publish = 'insert');
+DROP TABLE rf_bug;
+DROP TYPE rf_bug_status;
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+-- ok - conditional expressions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (a IS DOCUMENT);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (xmlexists('//foo[text() = ''bar'']' PASSING BY VALUE a));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1, 2) = a);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (CASE a WHEN 5 THEN true ELSE false END);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (COALESCE(b, 'foo') = 'foo');
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (GREATEST(a, 10) > 10);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IN (2, 4, 6));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (ARRAY[a] <@ ARRAY[2, 4, 6]);
-- fail - WHERE not allowed in DROP
-ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl1 WHERE (e < 27);
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
-ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_schema2;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_schema2, TABLE testpub_rf_schema2.testpub_rf_tb16;
RESET client_min_messages;
DROP TABLE testpub_rf_tbl1;
DROP TABLE testpub_rf_tbl2;
DROP TABLE testpub_rf_tbl3;
DROP TABLE testpub_rf_tbl4;
-DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
-DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
-DROP SCHEMA testpub_rf_myschema;
-DROP SCHEMA testpub_rf_myschema1;
+DROP TABLE testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema1.testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema2.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_schema1;
+DROP SCHEMA testpub_rf_schema2;
DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
-DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func1(integer, integer);
+DROP FUNCTION testpub_rf_func2();
+
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+CREATE TABLE rf_tbl_abcd_part_pk (a int PRIMARY KEY, b int) PARTITION by RANGE (a);
+CREATE TABLE rf_tbl_abcd_part_pk_1 (b int, a int PRIMARY KEY);
+ALTER TABLE rf_tbl_abcd_part_pk ATTACH PARTITION rf_tbl_abcd_part_pk_1 FOR VALUES FROM (1) TO (10);
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+-- ok - "a" is a PK col
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (b > 99);
+-- ok - "b" is a PK col
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "c" is not part of the PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- fail - "d" is not part of the PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not part of REPLICA IDENTITY
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "a" is in REPLICA IDENTITY now
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+
+-- Tests for partitioned table
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (a > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- ok - "a" is a OK col
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (b > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- fail - "b" is not in REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+
+DROP PUBLICATION testpub6;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+DROP TABLE rf_tbl_abcd_part_pk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index 64e71d0..de6b73d 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -280,9 +282,7 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -291,7 +291,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index f41ef0d..575969c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3501,6 +3501,7 @@ replace_rte_variables_context
ret_type
rewind_source
rewrite_event
+rf_context
rijndael_ctx
rm_detail_t
role_auth_extra
--
1.8.3.1
v46-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v46-0001-Row-filter-for-logical-replication.patchDownload
From 057a96c6f26fc01a8310f529d2f4fffa381b4fdf Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Mon, 13 Dec 2021 19:49:55 +1100
Subject: [PATCH v46] Row filter for logical replication.
This feature adds row filtering for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The row-filter WHERE clause for a table added to a publication that publishes
UPDATE and/or DELETE operations must contain only columns that are covered by
REPLICA IDENTITY. The row-filter WHERE clause for a table added to a publication
that publishes INSERT can use any column. If the row filter evaluates to NULL,
it is treated as false. The WHERE clause allows only simple expressions: they
cannot contain aggregate or window functions, non-immutable functions, or
user-defined types, operators, or functions. This restriction could be lifted
in the future.
If you choose to do the initial table synchronization, only data that satisfies
the row filters is pulled by the subscriber. If the subscription has several
publications in which a table has been published with different WHERE clauses,
rows which satisfy ANY of the expressions will be copied. If a subscriber is a
pre-15 version, the initial table synchronization won't use row filters even
if they are defined in the publisher.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines whether the partition's row filter (when
the parameter is false, the default) or the root partitioned table's row filter
is used.
Psql commands \dRp+ and \d+ will display any row filters.
Author: Euler Taveira, Peter Smith
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
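
As a sketch of the syntax described above (table and column names are
hypothetical, following the example in the documentation changes below):

```sql
-- Publish only rows of "departments" in a given id range. For a
-- publication that also publishes UPDATE/DELETE, the filter columns
-- must be covered by the replica identity (here, the PK "id").
CREATE PUBLICATION active_departments
    FOR TABLE departments WHERE (id > 2000 AND id <= 3000);

-- On a v15+ subscriber, the initial table synchronization copies
-- only rows satisfying the filter.
CREATE SUBSCRIPTION dept_sub
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION active_departments;
```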
Combining multiple row-filters
==============================
The subscription is treated "as a union of all the publications" [1], so the
row-filters are combined with OR.
If the subscription has several publications in which the same table has been
published with different filters, those expressions get OR'ed together so that
rows satisfying any of the expressions will be replicated.
Notice this means if one of the publications has no filter at all then all other
filters become redundant.
Author: Peter Smith
[1] https://www.postgresql.org/message-id/574b4e78-2f35-acf3-4bdc-4b872582e739%40enterprisedb.com
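
A minimal illustration of the OR'ed combination (publication and table names
are hypothetical):

```sql
CREATE PUBLICATION pub_low  FOR TABLE t WHERE (a < 10);
CREATE PUBLICATION pub_high FOR TABLE t WHERE (a > 90);

-- A subscription to both behaves as if the table had the single
-- filter (a < 10) OR (a > 90):
CREATE SUBSCRIPTION s
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION pub_low, pub_high;
```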
Row-filter caching
==================
The cached row-filters (e.g. ExprState *) are invalidated only in function
rel_sync_cache_relation_cb, so an ALTER PUBLICATION for one table will not
invalidate the cached row-filters of other tables.
Row-filters are built and cached lazily, just before they are needed (in the
pgoutput_row_filter function).
If there are multiple publication filters for a given table these are all
combined/flattened into a single filter.
Author: Peter Smith, Greg Nancarrow
The filter caching is based on suggestions from Amit [1] [2] and Hou [3]
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
[3] https://www.postgresql.org/message-id/OS0PR01MB5716090A70A73ADF58C58950948D9%40OS0PR01MB5716.jpnprd01.prod.outlook.com
Cache ExprState per pubaction.
If a subscriber has multiple publications and these publications include the
same table then there can be multiple filters that apply to that table.
These filters are stored per pubaction. There are 4 kinds of pubaction
("insert", "update", "delete", "truncate"), but row filters are not applied
for "truncate".
Filters for the same pubaction are all combined (OR'ed) and cached as one, so
at the end there are at most 3 cached filters per table.
The appropriate (pubaction) filter is executed according to the DML operation.
Author: Peter Smith
Discussion: https://www.postgresql.org/message-id/CAA4eK1%2BhVXfOSScbf5LUB%3D5is%3DwYaC6NBhLxuvetbWQnZRnsVQ%40mail.gmail.com
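
For example (hypothetical publications), filters are combined per operation,
not across operations:

```sql
-- pub_ins publishes only inserts, pub_upd only updates.
CREATE PUBLICATION pub_ins FOR TABLE t WHERE (a > 0)
    WITH (publish = 'insert');
CREATE PUBLICATION pub_upd FOR TABLE t WHERE (b > 0)
    WITH (publish = 'update');

-- A subscriber of both caches one ExprState for "insert" ((a > 0))
-- and one for "update" ((b > 0)); they are not OR'ed with each other.
```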
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 13 +-
doc/src/sgml/ref/create_publication.sgml | 37 ++-
doc/src/sgml/ref/create_subscription.sgml | 22 +-
src/backend/catalog/pg_publication.c | 62 +++-
src/backend/commands/publicationcmds.c | 108 ++++++-
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 38 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/parser/parse_relation.c | 9 +
src/backend/replication/logical/tablesync.c | 116 +++++++-
src/backend/replication/pgoutput/pgoutput.c | 421 +++++++++++++++++++++++++++-
src/bin/psql/describe.c | 26 +-
src/include/catalog/pg_publication.h | 7 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 146 ++++++++++
src/test/regress/sql/publication.sql | 74 +++++
src/test/subscription/t/027_row_filter.pl | 357 +++++++++++++++++++++++
24 files changed, 1445 insertions(+), 50 deletions(-)
mode change 100644 => 100755 src/backend/parser/gram.y
create mode 100644 src/test/subscription/t/027_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 025db98..4433595 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6311,6 +6311,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index bb4ef5e..5d9869c 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -30,7 +30,7 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -52,7 +52,9 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
remove one or more tables/schemas from the publication. Note that adding
tables/schemas to a publication that is already subscribed to will require a
<literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on the
- subscribing side in order to become effective.
+ subscribing side in order to become effective. Note also that the combination
+ of <literal>DROP</literal> with <literal>WHERE</literal> clause is not
+ allowed.
</para>
<para>
@@ -109,7 +111,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index d805e8e..5aeee23 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -28,7 +28,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -79,6 +79,13 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ If the optional <literal>WHERE</literal> clause is specified, only rows
+ that satisfy the <replaceable class="parameter">expression</replaceable>
+ will be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
+ </para>
+
+ <para>
Only persistent base tables and partitioned tables can be part of a
publication. Temporary tables, unlogged tables, foreign tables,
materialized views, and regular views cannot be part of a publication.
@@ -226,6 +233,22 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ A <literal>WHERE</literal> clause must contain only columns that are
+ part of the primary key or are covered by the <literal>REPLICA
+ IDENTITY</literal>, in order for <command>UPDATE</command> and
+ <command>DELETE</command> operations to be published.
+ For publication of <command>INSERT</command> operations, any column
+ may be used in the <literal>WHERE</literal> clause.
+ If nullable columns are present in the <literal>WHERE</literal> clause,
+ possible NULL values should be accounted for in expressions, to avoid
+ unexpected results, because <literal>NULL</literal> values can cause
+ those expressions to evaluate to false.
+ A <literal>WHERE</literal> clause allows simple expressions. The simple
+ expression cannot contain any aggregate or window functions, non-immutable
+ functions, user-defined types, operators or functions.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -247,6 +270,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -260,6 +288,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..d5c96e0 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -208,6 +208,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
that are being subscribed to when the replication starts.
The default is <literal>true</literal>.
</para>
+ <para>
+ If the publications contain conditional expressions, it will affect
+ what data is copied. Refer to the
+ <xref linkend="sql-createsubscription-notes" /> for details.
+ </para>
</listitem>
</varlistentry>
@@ -293,7 +298,7 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</variablelist>
</refsect1>
- <refsect1>
+ <refsect1 id="sql-createsubscription-notes" xreflabel="Notes">
<title>Notes</title>
<para>
@@ -319,6 +324,21 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
the parameter <literal>create_slot = false</literal>. This is an
implementation restriction that might be lifted in a future release.
</para>
+
+ <para>
+ If any table in the publication has a <literal>WHERE</literal> clause, rows
+ that do not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published (i.e. they will be filtered out).
+ If the subscription has several publications in which the same table has been
+ published with different <literal>WHERE</literal> clauses, those expressions
+ (for the same publish operation) get OR'ed together so that rows satisfying any
+ of the expressions will be published. Notice this means if one of the publications
+ has no <literal>WHERE</literal> clause at all then all other <literal>WHERE</literal>
+ clauses (for the same publish operation) become redundant.
+ If the subscriber is a <productname>PostgreSQL</productname> version before 15
+ then any row filtering is ignored during the initial data synchronization phase.
+ </para>
+
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 62f10bc..bc5f6a2 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -275,22 +278,51 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
return result;
}
+Node *
+GetTransformedWhereClause(ParseState *pstate, PublicationRelInfo *pri,
+ bool bfixupcollation)
+{
+ ParseNamespaceItem *nsitem;
+ Node *transformedwhereclause = NULL;
+
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, pri->relation,
+ AccessShareLock,
+ NULL, false, false);
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ transformedwhereclause = transformWhereClause(pstate,
+ copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION WHERE");
+
+ /* Fix up collation information */
+ if (bfixupcollation)
+ assign_expr_collations(pstate, transformedwhereclause);
+
+ return transformedwhereclause;
+}
+
/*
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -311,10 +343,19 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a ParseState to parse with */
+ pstate = make_parsestate(NULL);
+
+ /* Fix up collation information */
+ whereclause = GetTransformedWhereClause(pstate, pri, true);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -328,6 +369,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -344,6 +391,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 404bb5d..9ca743c 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -529,40 +529,96 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *delrels = NIL;
ListCell *oldlc;
+ /*
+ * Check if the relation is member of the existing schema in the
+ * publication or member of the schema list specified.
+ */
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
- /* Calculate which relations to drop. */
+ /*
+ * In order to recreate the relation list for the publication,
+ * look for existing relations that need not be dropped.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
+ ListCell *newlc;
+ PublicationRelInfo *oldrel;
bool found = false;
+ HeapTuple rftuple;
+ bool rfisnull = true;
+ Node *oldrelwhereclause = NULL;
+
+ /* look up the cache for the old relmap */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(oldrelid),
+ ObjectIdGetDatum(pubid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ Datum whereClauseDatum;
+
+ whereClauseDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual,
+ &rfisnull);
+ if (!rfisnull)
+ oldrelwhereclause = stringToNode(TextDatumGetCString(whereClauseDatum));
+
+ ReleaseSysCache(rftuple);
+ }
foreach(newlc, rels)
{
PublicationRelInfo *newpubrel;
newpubrel = (PublicationRelInfo *) lfirst(newlc);
+
+ /*
+ * Look if any of the new set of relations match with
+ * the existing relations in the publication. Additionally,
+ * if the relation has an associated where-clause, check the
+ * where-clauses also match. Drop the rest.
+ */
if (RelationGetRelid(newpubrel->relation) == oldrelid)
{
- found = true;
- break;
+ if (rfisnull && !newpubrel->whereClause)
+ {
+ found = true;
+ break;
+ }
+
+ if (!rfisnull && newpubrel->whereClause)
+ {
+ ParseState *pstate = make_parsestate(NULL);
+ Node *whereclause;
+
+ whereclause = GetTransformedWhereClause(pstate,
+ newpubrel,
+ false);
+ if (equal(oldrelwhereclause, whereclause))
+ {
+ free_parsestate(pstate);
+ found = true;
+ break;
+ }
+
+ free_parsestate(pstate);
+ }
}
}
- /* Not yet in the list, open it and add to the list */
- if (!found)
- {
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
+ if (oldrelwhereclause)
+ pfree(oldrelwhereclause);
- delrels = lappend(delrels, pubrel);
+ /*
+ * Add the non-matched relations to a list so that they can
+ * be dropped.
+ */
+ if (!found)
+ {
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrelid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
}
@@ -899,6 +955,7 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ List *relids_with_rf = NIL;
/*
* Open, share-lock, and check all the explicitly-specified relations
@@ -926,15 +983,26 @@ OpenTableList(List *tables)
*/
if (list_member_oid(relids, myrelid))
{
+ /* Disallow duplicate tables if there are any with row-filters. */
+ if (t->whereClause || list_member_oid(relids_with_rf, myrelid))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("conflicting or redundant row-filters for \"%s\"",
+ RelationGetRelationName(rel))));
+
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
+ if (t->whereClause)
+ relids_with_rf = lappend_oid(relids_with_rf, myrelid);
+
/*
* Add children of this rel, if requested, so that they too are added
* to the publication. A partitioned table can't have any inheritance
@@ -967,6 +1035,8 @@ OpenTableList(List *tables)
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -974,6 +1044,7 @@ OpenTableList(List *tables)
}
list_free(relids);
+ list_free(relids_with_rf);
return rels;
}
@@ -993,6 +1064,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -1088,6 +1161,11 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
RelationGetRelationName(rel))));
}
+ if (pubrel->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE")));
+
ObjectAddressSet(obj, PublicationRelRelationId, prid);
performDeletion(&obj, DROP_CASCADE, 0);
}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index df0b747..bd55ea6 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4833,6 +4833,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index cb7ddd4..028b8e5 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2312,6 +2312,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
old mode 100644
new mode 100755
index 3d4dd43..9da93a0
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9742,12 +9742,13 @@ CreatePublicationStmt:
* relation_expr here.
*/
PublicationObjSpec:
- TABLE relation_expr
+ TABLE relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_TABLE;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $2;
+ $$->pubtable->whereClause = $3;
}
| ALL TABLES IN_P SCHEMA ColId
{
@@ -9762,28 +9763,45 @@ PublicationObjSpec:
$$->pubobjtype = PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA;
$$->location = @5;
}
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ /*
+ * The OptWhereClause must be stored here but it is
+ * valid only for tables. If the ColId was mistakenly
+ * not a table this will be detected later in
+ * preprocess_pubobj_list() and an error is thrown.
+ */
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
$$->location = @1;
}
- | ColId indirection
+ | ColId indirection OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
+ $$->pubtable->whereClause = $3;
$$->location = @1;
}
/* grammar like tablename * , ONLY tablename, ONLY ( tablename ) */
- | extended_relation_expr
+ | extended_relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $1;
+ $$->pubtable->whereClause = $2;
}
| CURRENT_SCHEMA
{
@@ -17431,7 +17449,8 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ if (pubobj->name)
{
/* convert it to PublicationTable */
PublicationTable *pubtable = makeNode(PublicationTable);
@@ -17444,6 +17463,13 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
else if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_SCHEMA ||
pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA)
{
+ /* WHERE clause is not allowed on a schema object */
+ if (pubobj->pubtable && pubobj->pubtable->whereClause)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("WHERE clause not allowed for schema"),
+ parser_errposition(pubobj->location));
+
/*
* We can distinguish between the different type of schema
* objects based on whether name and pubtable is set.
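For context, the grammar changes above accept an optional WHERE clause after each table in the publication object list, while preprocess_pubobj_list() rejects one attached to a schema. A hedged sketch of the intended behavior (publication and table names are illustrative):

```sql
-- Row filter attached to a table: accepted by the new grammar.
CREATE PUBLICATION pub_dept FOR TABLE departments WHERE (id > 2000 AND id <= 3000);

-- WHERE clause attached to a schema object: rejected in
-- preprocess_pubobj_list() with a syntax error.
CREATE PUBLICATION pub_sch FOR ALL TABLES IN SCHEMA public WHERE (id > 0);
```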
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 542f916..29bebb7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Disallow user-defined operators in publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index c5c3f26..f66243e 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -3538,11 +3538,20 @@ errorMissingRTE(ParseState *pstate, RangeVar *relation)
rte->eref->aliasname)),
parser_errposition(pstate, relation->location)));
else
+ {
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_TABLE),
+ errmsg("invalid reference to table \"%s\" in publication WHERE expression",
+ relation->relname),
+ parser_errposition(pstate, relation->location)));
+
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_TABLE),
errmsg("missing FROM-clause entry for table \"%s\"",
relation->relname),
parser_errposition(pstate, relation->location)));
+ }
}
/*
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..971e037 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -688,19 +688,23 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the
+ * relation qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,80 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get relation qual. DISTINCT prevents the same expression, when the
+ * table appears in multiple publications, from being included more than
+ * once in the final expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ /*
+ * Multiple row filter expressions for the same table are combined
+ * with OR in the COPY query. If any of the filter expressions for
+ * this table is null, the whole table will be copied. In that case
+ * there is no need to construct a unified row filter expression at all.
+ */
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+
+ /*
+ * An entry without a row filter expression means the whole table
+ * is copied, so clean up any previously collected expressions
+ * and return with none.
+ */
+ if (isnull)
+ {
+ if (*qual)
+ {
+ list_free_deep(*qual);
+ *qual = NIL;
+ }
+ break;
+ }
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +887,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +896,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +907,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with OR'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +927,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of OR'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
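To illustrate the query built by copy_table() above: assuming a table public.t with columns a and b, and two row filters fetched from the publisher (all names and filters here are illustrative), the initial sync would issue roughly:

```sql
COPY (SELECT a, b FROM public.t WHERE (a > 10) OR (b < 0)) TO STDOUT
```

If any publication defines no filter for the table, qual is NIL and the plain `COPY public.t TO STDOUT` form is used instead.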
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6f6a203..52ed2c6 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -15,16 +15,24 @@
#include "access/tupconvert.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/makefuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -116,6 +124,24 @@ typedef struct RelationSyncEntry
PublicationActions pubactions;
/*
+ * An ExprState alone cannot distinguish "no cache", "invalid cache",
+ * and "valid cache", so the flag exprstate_valid indicates whether the
+ * current cache is valid.
+ *
+ * Multiple ExprState entries might be used if there are multiple
+ * publications for a single table. Because the publication actions can
+ * differ, the expressions cannot always be combined into one, so there is
+ * one ExprState per publication action. Only 3 publication actions are used
+ * for row filtering ("insert", "update", "delete"). The exprstate array is
+ * indexed by ReorderBufferChangeType.
+ */
+ bool exprstate_valid;
+#define IDX_PUBACTION_n 3
+ ExprState *exprstate[IDX_PUBACTION_n]; /* ExprState array for row filter.
+ One per publication action. */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+
+ /*
* OID of the relation to publish changes as. For a partition, this may
* be set to one of its ancestors whose schema will be used when
* replicating changes, if publish_via_partition_root is set for the
@@ -137,7 +163,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +172,14 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
+ Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +655,327 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+
+ /* Cache ExprState using CacheMemoryContext. */
+ Assert(CurrentMemoryContext == CacheMemoryContext);
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be cast to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * This is the same code as ExecPrepareExpr() but that is not used because
+ * we have no EState to pass it. There should probably be another function
+ * in the executor to handle the execution outside a normal Plan tree
+ * context.
+ */
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false i.e. the change
+ * isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
+ Relation relation, HeapTuple oldtuple, HeapTuple newtuple,
+ RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ List *rfnodes[] = {NIL, NIL, NIL}; /* One per pubaction */
+ bool no_filter[] = {false, false, false}; /* One per pubaction */
+
+ Assert(changetype == REORDER_BUFFER_CHANGE_INSERT ||
+ changetype == REORDER_BUFFER_CHANGE_UPDATE ||
+ changetype == REORDER_BUFFER_CHANGE_DELETE);
+ /*
+ * If the row filter cache is currently flagged "invalid", we do not yet
+ * know whether there are any row filters for this relation.
+ *
+ * This code is usually one-time execution.
+ *
+ * NOTE: The ExprState cache could have been created up-front in the
+ * function get_rel_sync_entry() instead of the deferred on-the-fly
+ * assignment below. The reason for choosing to do it here is because there
+ * are some scenarios where the get_rel_sync_entry() is called but where a
+ * row will not be published. For example, for truncate, we may not need
+ * any row evaluation, so there is no need to compute it. It would also be
+ * a waste if any error happens before actually evaluating the filter. And
+ * tomorrow there could be other operations (which use get_rel_sync_entry)
+ * but which don't need to build ExprState. Furthermore, because the
+ * decision to publish or not is made AFTER the call to get_rel_sync_entry
+ * it may be that the filter evaluation is not necessary at all. So the
+ * decision was to defer this logic to last moment when we know it will be
+ * needed.
+ */
+ if (!entry->exprstate_valid)
+ {
+ MemoryContext oldctx;
+ int idx_ins = REORDER_BUFFER_CHANGE_INSERT;
+ int idx_upd = REORDER_BUFFER_CHANGE_UPDATE;
+ int idx_del = REORDER_BUFFER_CHANGE_DELETE;
+
+ /* Release the tuple table slot if it already exists. */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState and cache it in entry->exprstate.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ *
+ * NOTE: Multiple publications might have multiple row filters for this
+ * relation. Since row filter usage depends on the DML operation,
+ * there are multiple lists (one for each operation) to which the row
+ * filters will be appended.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+
+ /*
+ * Look up whether there is a row filter, and if so remember it in a
+ * list (per pubaction). If not, remember that this pubaction has no
+ * filter. Code following this 'publications' loop will combine all
+ * filters.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(entry->publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ /* Gather the rfnodes per pubaction of this publication. */
+ if (pub->pubactions.pubinsert)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[idx_ins] = lappend(rfnodes[idx_ins], rfnode);
+ }
+ if (pub->pubactions.pubupdate)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[idx_upd] = lappend(rfnodes[idx_upd], rfnode);
+ }
+ if (pub->pubactions.pubdelete)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[idx_del] = lappend(rfnodes[idx_del], rfnode);
+ }
+ MemoryContextSwitchTo(oldctx);
+ }
+ else
+ {
+ /* Remember which pubactions have no row-filter. */
+ if (pub->pubactions.pubinsert)
+ no_filter[idx_ins] = true;
+ if (pub->pubactions.pubupdate)
+ no_filter[idx_upd] = true;
+ if (pub->pubactions.pubdelete)
+ no_filter[idx_del] = true;
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+
+ } /* loop all subscribed publications */
+
+ /*
+ * Now all the filters for all pubactions are known. Combine them when
+ * their pubactions are the same. (The rule for discarding filters when
+ * a publication has none is applied per pubaction below.)
+ */
+ {
+ int idx;
+ bool found_filters = false;
+
+ /* For each pubaction... */
+ for (idx = 0; idx < IDX_PUBACTION_n; idx++)
+ {
+ int n_filters;
+
+ /*
+ * All row filter expressions will be discarded if there is one
+ * publication-relation entry without a row filter. That's because
+ * all expressions are aggregated by the OR operator. The row
+ * filter absence means replicate all rows so a single valid
+ * expression means publish this row.
+ */
+ if (no_filter[idx])
+ {
+ if (rfnodes[idx])
+ {
+ list_free_deep(rfnodes[idx]);
+ rfnodes[idx] = NIL;
+ }
+ }
+
+ /*
+ * If there were one or more filters for this pubaction, combine them
+ * (if necessary) and cache the ExprState.
+ */
+ n_filters = list_length(rfnodes[idx]);
+ if (n_filters > 0)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes[idx], -1) : linitial(rfnodes[idx]);
+ entry->exprstate[idx] = pgoutput_row_filter_init_expr(rfnode);
+ MemoryContextSwitchTo(oldctx);
+
+ found_filters = true; /* flag that we will need slots made */
+ }
+ } /* for each pubaction */
+
+ if (found_filters)
+ {
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create a tuple table slot for the row filter. Create a copy of
+ * the TupleDesc as it needs to live as long as the cache entry
+ * remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+ }
+ }
+
+ entry->exprstate_valid = true;
+ }
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate[changetype])
+ return true;
+
+ if (message_level_is_interesting(DEBUG3))
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * Evaluate the row filter. Multiple publication row filters have
+ * already been combined into a single exprstate (for this pubaction),
+ * and the early return above guarantees it is set.
+ */
+ result = pgoutput_row_filter_exec_expr(entry->exprstate[changetype], ecxt);
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +1002,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +1026,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +1033,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(change->action, data, relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * Schema should be sent before the logic that replaces the
+ * relation because it also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +1066,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(change->action, data, relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +1100,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(change->action, data, relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +1169,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1116,9 +1491,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1139,8 +1515,13 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->exprstate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate[REORDER_BUFFER_CHANGE_INSERT] = NULL;
+ entry->exprstate[REORDER_BUFFER_CHANGE_UPDATE] = NULL;
+ entry->exprstate[REORDER_BUFFER_CHANGE_DELETE] = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1245,9 +1626,6 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
}
list_free(pubids);
@@ -1310,6 +1688,7 @@ static void
rel_sync_cache_relation_cb(Datum arg, Oid relid)
{
RelationSyncEntry *entry;
+ int idx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1354,6 +1733,25 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanups. (Will be rebuilt later if needed).
+ */
+ entry->exprstate_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ /* Cleanup the ExprState for each of the pubactions. */
+ for (idx = 0; idx < IDX_PUBACTION_n; idx++)
+ {
+ if (entry->exprstate[idx] != NULL)
+ {
+ pfree(entry->exprstate[idx]);
+ entry->exprstate[idx] = NULL;
+ }
+ }
}
}
@@ -1365,6 +1763,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1773,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1392,6 +1793,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
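The combining rules implemented in pgoutput_row_filter() can be summarized with illustrative publications (names hypothetical): filters for the same pubaction are OR'ed, and a single publication without a filter discards all filters for that pubaction:

```sql
CREATE PUBLICATION p1 FOR TABLE t WHERE (id < 100);
CREATE PUBLICATION p2 FOR TABLE t WHERE (id > 900);
-- A subscription to p1 and p2 replicates rows where id < 100 OR id > 900.

CREATE PUBLICATION p3 FOR TABLE t;
-- Adding p3 to the same subscription disables filtering for t entirely:
-- a publication without a row filter means all rows are replicated.
```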
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index ea721d9..fb5cfc5 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3150,17 +3150,21 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
" JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
"WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
"UNION\n"
"SELECT pubname\n"
+ " , pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+ " JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
"WHERE pr.prrelid = '%s'\n"
"UNION\n"
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
"WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
"ORDER BY 1;",
@@ -3196,6 +3200,13 @@ describeOneTableDetails(const char *schemaname,
printfPQExpBuffer(&buf, " \"%s\"",
PQgetvalue(result, i, 0));
+ /* row filter (if any) */
+ if (pset.sversion >= 150000)
+ {
+ if (!PQgetisnull(result, i, 1))
+ appendPQExpBuffer(&buf, " WHERE %s", PQgetvalue(result, i, 1));
+ }
+
printTableAddFooter(&cont, buf.data);
}
PQclear(result);
@@ -6320,8 +6331,12 @@ addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
for (i = 0; i < count; i++)
{
if (!singlecol)
+ {
printfPQExpBuffer(buf, " \"%s.%s\"", PQgetvalue(res, i, 0),
PQgetvalue(res, i, 1));
+ if (!PQgetisnull(res, i, 2))
+ appendPQExpBuffer(buf, " WHERE %s", PQgetvalue(res, i, 2));
+ }
else
printfPQExpBuffer(buf, " \"%s\"", PQgetvalue(res, i, 0));
@@ -6450,8 +6465,15 @@ describePublications(const char *pattern)
{
/* Get the tables for the specified publication */
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 902f2f2..96c55f6 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -20,6 +20,7 @@
#include "catalog/genbki.h"
#include "catalog/objectaddress.h"
#include "catalog/pg_publication_d.h"
+#include "parser/parse_node.h"
/* ----------------
* pg_publication definition. cpp turns this into
@@ -86,6 +87,7 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -123,13 +125,16 @@ extern List *GetPubPartitionOptionRelations(List *result,
extern bool is_publishable_relation(Relation rel);
extern bool is_schema_publication(Oid pubid);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
extern char *get_publication_name(Oid pubid, bool missing_ok);
+extern Node *GetTransformedWhereClause(ParseState *pstate,
+ PublicationRelInfo *pri,
+ bool bfixupcollation);
#endif /* PG_PUBLICATION_H */
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 4c5a8a3..e437a55 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3642,6 +3642,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 5ac2d66..f93a63d 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -239,6 +239,152 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
+ "public.testpub_rf_tbl3" WHERE ((e > 1000) AND (e < 2000))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE ((e > 1000) AND (e < 2000))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
+
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dplus_rf_yes FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dplus_rf_no FOR TABLE testpub_rf_tbl1;
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+ Table "public.testpub_rf_tbl1"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+----------+--------------+-------------
+ a | integer | | | | plain | |
+ b | text | | | | extended | |
+Publications:
+ "testpub_dplus_rf_no"
+ "testpub_dplus_rf_yes" WHERE (a > 1)
+
+DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+ Publication testpub_syntax1
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (e < 999)
+
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+ Publication testpub_syntax2
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "testpub_rf_myschema.testpub_rf_tbl5" WHERE (h < 999)
+
+DROP PUBLICATION testpub_syntax2;
+-- fail - a WHERE row-filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a =...
+ ^
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+ERROR: WHERE clause for schema not allowed
+LINE 1: ...ax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf...
+ ^
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+ERROR: cannot add relation "testpub_rf_myschema1.testpub_rf_tb16" to publication
+DETAIL: Table's schema "testpub_rf_myschema1" is already part of the publication or part of the specified schema list.
+RESET client_min_messages;
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 56dd358..c13ccc5 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -134,6 +134,80 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dplus_rf_yes FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dplus_rf_no FOR TABLE testpub_rf_tbl1;
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;
+-- fail - a WHERE row-filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
+RESET client_min_messages;
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+RESET client_min_messages;
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
new file mode 100644
index 0000000..64e71d0
--- /dev/null
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -0,0 +1,357 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 10;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4a FOR TABLE tab_rowfilter_4 WHERE (c % 2 = 0)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4b FOR TABLE tab_rowfilter_4"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) SELECT generate_series(1, 10)");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4a, tap_pub_4b"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if
+# c is divisible by 2 OR 3 (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(13|2|20), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together but when there is no filter it just means
+# OR everything - i.e. the same as no filter at all.
+# Expect all rows: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(10|1|10), 'check initial data copy from table tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) VALUES (21), (22), (23), (24), (25)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) VALUES (0), (11), (12)");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if
+# c is divisible by 2 OR 3.
+#
+# Expect original rows (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+# Plus (21, 22, 24)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(16|2|24), 'check replicated rows to tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together but when there is no filter it just means
+# OR everything - i.e. the same as no filter at all.
+# Expect all rows from initial copy: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+# And also (0, 11, 12)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
On Thu, Dec 9, 2021 at 1:37 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Wednesday, December 8, 2021 7:52 PM Ajin Cherian <itsajin@gmail.com> wrote:
On Tue, Dec 7, 2021 at 5:36 PM Peter Smith <smithpb2250@gmail.com> wrote:
We were mid-way putting together the next v45* when your latest
attachment was posted over the weekend, so we will proceed with our
original plan to post our v45* (tomorrow). After v45* is posted, we
will pause to find all the differences between your unified patch and
our v45* patch set. Our intention is to integrate as many improvements
as possible from your changes into the v46* etc. that will follow
tomorrow's v45*. On some points, we will most likely need further
discussion.

Posting an update for review comments, with contributions largely from
Peter Smith. I've also included changes based on Euler's combined patch,
especially changes to documentation and test cases. I have left out
Hou-san's 0005 in this patch set; Hou-san will provide a rebased update
based on this. This patch addresses the following review comments:
Hi,
Thanks for updating the patch.
I noticed a possible issue.

+	/* Check row filter. */
+	if (!pgoutput_row_filter(data, relation, oldtuple, NULL, relentry))
+		break;
+
+	maybe_send_schema(ctx, change, relation, relentry);
+
	/* Switch relation if publishing via root. */
	if (relentry->publish_as_relid != RelationGetRelid(relation))
	{
		...
		/* Convert tuple if needed. */
		if (relentry->map)
			tuple = execute_attr_map_tuple(tuple, relentry->map);

Currently, we execute the row filter before converting the tuple. I
think it could get a wrong result if we are executing a parent table's
row filter and the column order of the parent table is different from
the child table's. For example:

create table parent(a int primary key, b int) partition by range (a);
create table child (b int, a int primary key);
alter table parent attach partition child default;
create publication pub for table parent where(a>10) with(PUBLISH_VIA_PARTITION_ROOT);

The column number of 'a' is '1' in the filter expression, while column
'a' is the second one in the original tuple. I think we might need to
execute the filter expression after converting.
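The ordering hazard described above can be simulated outside PostgreSQL. The sketch below is plain Python (not pgoutput code): a filter compiled against the parent's column order is applied to a child tuple whose columns are physically reordered, first without and then with the attribute-map conversion that `execute_attr_map_tuple` performs.

```python
# Simulation of the bug: parent columns are (a, b), but the child stores
# them as (b, a). A filter "a > 10" compiled against the parent refers
# to attribute position 0.
parent_cols = ["a", "b"]
child_cols = ["b", "a"]

def row_filter(tup):
    # "a > 10", with "a" resolved to position 0 of the PARENT descriptor
    return tup[0] > 10

# Attribute map: reorder a child tuple into the parent's column order,
# analogous to execute_attr_map_tuple().
attr_map = [child_cols.index(c) for c in parent_cols]

def convert(child_tup):
    return tuple(child_tup[i] for i in attr_map)

child_tup = (100, 5)   # b=100, a=5: should NOT pass "a > 10"

wrong = row_filter(child_tup)            # mistakenly evaluates b > 10
right = row_filter(convert(child_tup))   # correctly evaluates a > 10

print(wrong, right)  # True False
```

Evaluating before conversion accepts the row; evaluating after conversion correctly rejects it, which is why the filter must run on the converted tuple.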
Fixed in v46* [1]
------
[1]: /messages/by-id/CAHut+Ptoxjo6hpDFTya6WYH-zdspKQ5j+wZHBRc6EZkAkq7Nfw@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Dec 7, 2021 at 5:48 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
...
Thanks for looking into it.
I have another problem with your patch. The document says:
... If the subscription has several publications in
+ which the same table has been published with different filters, those
+ expressions get OR'ed together so that rows satisfying any of the expressions
+ will be replicated. Notice this means if one of the publications has no filter
+ at all then all other filters become redundant.

Then, what if one of the publications is specified as 'FOR ALL TABLES'
or 'FOR ALL TABLES IN SCHEMA'?

For example:
create table tbl (a int primary key);
create publication p1 for table tbl where (a > 10);
create publication p2 for all tables;
create subscription sub connection 'dbname=postgres port=5432' publication p1, p2;

I think for a "FOR ALL TABLES" publication (p2 in my case), table tbl
should be treated as having no filter, and table tbl should have no
filter in subscription sub. Thoughts?

But for now, the filter (a > 10) works both when copying initial data
and for later changes.
To fix it, I think we can check if the table is published in a 'FOR ALL TABLES'
publication or published as part of schema in function pgoutput_row_filter_init
(which was introduced in v44-0003 patch), also we need to make some changes in
tablesync.c.
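The documented semantics (filters OR'ed together, with a missing filter
meaning replicate everything) can be sketched like this in Python.
This is an illustrative simulation, not PostgreSQL code; pub_filters is
a hypothetical list of per-publication filter callables, with None
standing for a publication that has no filter, such as FOR ALL TABLES:

```python
def row_matches(pub_filters, row):
    """Row filters from all subscribed publications are OR'ed together.

    A publication without a filter (None) publishes every row, so its
    presence makes all the other filters redundant for this table.
    """
    if any(f is None for f in pub_filters):
        return True
    return any(f(row) for f in pub_filters)

p1 = lambda row: row["a"] > 10  # create publication p1 ... where (a > 10)
p2 = None                       # create publication p2 for all tables

print(row_matches([p1], {"a": 5}))      # False: filtered out by p1
print(row_matches([p1, p2], {"a": 5}))  # True: p2 has no filter, so all rows pass
```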
Partly fixed in v46-0005 [1]
NOTE
- The initial COPY part of the tablesync does not take the publish
operation into account so it means that if any of the subscribed
publications have "puballtables" flag then all data will be copied
sans filters. I guess this is consistent with the other decision to
ignore publication operations [2].
TODO
- Documentation
- IIUC there is a similar case yet to be addressed - FOR ALL TABLES IN SCHEMA
------
[1]: /messages/by-id/CAHut+Ptoxjo6hpDFTya6WYH-zdspKQ5j+wZHBRc6EZkAkq7Nfw@mail.gmail.com
[2]: /messages/by-id/CAA4eK1L3r+URSLFotOT5Y88ffscCskRoGC15H3CSAU1jj_0Rdg@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Dec 14, 2021 at 4:44 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Tue, Dec 7, 2021 at 5:48 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:

I think for a "FOR ALL TABLES" publication (p2 in my case), table tbl
should be treated as having no filter, and table tbl should have no
filter in subscription sub. Thoughts?

But for now, the filter (a > 10) works both when copying initial data
and for later changes.
To fix it, I think we can check if the table is published in a 'FOR ALL TABLES'
publication or published as part of schema in function pgoutput_row_filter_init
(which was introduced in v44-0003 patch), also we need to make some changes in
tablesync.c.

Partly fixed in v46-0005 [1]
NOTE
- The initial COPY part of the tablesync does not take the publish
operation into account so it means that if any of the subscribed
publications have "puballtables" flag then all data will be copied
sans filters.
I think this should be okay but the way you have implemented it in the
patch doesn't appear to be the optimal way. Can't we fetch
allpubtables info and qual info as part of one query instead of using
separate queries?
I guess this is consistent with the other decision to
ignore publication operations [2].

TODO
- Documentation
- IIUC there is a similar case yet to be addressed - FOR ALL TABLES IN SCHEMA
Yeah, "FOR ALL TABLES IN SCHEMA" should also be addressed. In this
case, the difference would be that we need to check the presence of
schema corresponding to the table (for which we are fetching
row_filter information) is there in pg_publication_namespace. If it
exists then we don't need to apply row_filter for the table. I feel it
is better to fetch all this information as part of the query which you
are using to fetch row_filter info. The idea is to avoid the extra
round-trip between subscriber and publisher.
Few other comments:
===================
1.
@@ -926,6 +928,22 @@ pgoutput_row_filter_init(PGOutputData *data,
Relation relation, RelationSyncEntr
bool rfisnull;
/*
+ * If the publication is FOR ALL TABLES then it is treated same as if this
+ * table has no filters (even if for some other publication it does).
+ */
+ if (pub->alltables)
+ {
+ if (pub->pubactions.pubinsert)
+ no_filter[idx_ins] = true;
+ if (pub->pubactions.pubupdate)
+ no_filter[idx_upd] = true;
+ if (pub->pubactions.pubdelete)
+ no_filter[idx_del] = true;
+
+ continue;
+ }
Is there a reason to continue checking the other publications if
no_filter is true for all kind of pubactions?
2.
+ * All row filter expressions will be discarded if there is one
+ * publication-relation entry without a row filter. That's because
+ * all expressions are aggregated by the OR operator. The row
+ * filter absence means replicate all rows so a single valid
+ * expression means publish this row.
This same comment is at two places, remove from one of the places. I
think keeping it atop for loop is better.
3.
+ {
+ int idx;
+ bool found_filters = false;
I am not sure if starting such ad-hoc braces in the code to localize
the scope of variables is a regular practice. Can we please remove
this?
--
With Regards,
Amit Kapila.
On Tue, Dec 14, 2021 at 10:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Dec 14, 2021 at 4:44 AM Peter Smith <smithpb2250@gmail.com> wrote:
Few other comments:
===================
Few more comments:
==================
v46-0001/0002
===============
1. After rowfilter_walker() why do we need
EXPR_KIND_PUBLICATION_WHERE? I thought this is primarily to identify
the expressions that are not allowed in rowfilter which we are now
able to detect upfront with the help of a walker. Can't we instead use
EXPR_KIND_WHERE?
2.
+Node *
+GetTransformedWhereClause(ParseState *pstate, PublicationRelInfo *pri,
+ bool bfixupcollation)
Can we add comments atop this function?
3. In GetTransformedWhereClause, can we change the name of variables
(a) bfixupcollation to fixup_collation or assign_collation, (b)
transformedwhereclause to whereclause. I think that will make the
function more readable.
v46-0002
========
4.
+ else if (IsA(node, List) || IsA(node, Const) || IsA(node, BoolExpr)
|| IsA(node, NullIfExpr) ||
+ IsA(node, NullTest) || IsA(node, BooleanTest) || IsA(node, CoalesceExpr) ||
+ IsA(node, CaseExpr) || IsA(node, CaseTestExpr) || IsA(node, MinMaxExpr) ||
+ IsA(node, ArrayExpr) || IsA(node, ScalarArrayOpExpr) || IsA(node, XmlExpr))
Can we move this to a separate function say IsValidRowFilterExpr() or
something on those lines and use Switch (nodetag(node)) to identify
these nodes?
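Amit's suggested refactor could look roughly like this (sketched in
Python rather than C; the node-tag names mirror the IsA() chain above,
and is_valid_row_filter_expr is a hypothetical name following the
IsValidRowFilterExpr suggestion — in C this would be one switch over
nodeTag(node) with grouped case labels):

```python
# Hypothetical whitelist of node tags allowed in a row filter expression,
# replacing a long chain of IsA(node, ...) || IsA(node, ...) tests.
ALLOWED_ROWFILTER_NODES = {
    "List", "Const", "BoolExpr", "NullIfExpr",
    "NullTest", "BooleanTest", "CoalesceExpr",
    "CaseExpr", "CaseTestExpr", "MinMaxExpr",
    "ArrayExpr", "ScalarArrayOpExpr", "XmlExpr",
}

def is_valid_row_filter_expr(node_tag: str) -> bool:
    """One membership test instead of thirteen chained type checks."""
    return node_tag in ALLOWED_ROWFILTER_NODES

print(is_valid_row_filter_expr("Const"))     # True: allowed in a row filter
print(is_valid_row_filter_expr("FuncExpr"))  # False: rejected by the walker
```

The design point is the same in either language: the set of permitted
node kinds becomes data (or case labels) in one place, so adding or
removing an allowed node kind is a one-line change.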
--
With Regards,
Amit Kapila.
On Tue, Dec 14, 2021 at 10:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Dec 14, 2021 at 10:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Dec 14, 2021 at 4:44 AM Peter Smith <smithpb2250@gmail.com> wrote:
Few other comments:
===================

Few more comments:
==================
v46-0001/0002
===============
1. After rowfilter_walker() why do we need
EXPR_KIND_PUBLICATION_WHERE? I thought this is primarily to identify
the expressions that are not allowed in rowfilter which we are now
able to detect upfront with the help of a walker. Can't we instead use
EXPR_KIND_WHERE?
FYI - I have tried this locally and all tests pass.
~~
If the EXPR_KIND_PUBLICATION_WHERE is removed then there will be some
differences:
- we would get errors for aggregate/grouping functions from the EXPR_KIND_WHERE
- we would get errors for window functions from the EXPR_KIND_WHERE
- we would get errors for set-returning functions from the EXPR_KIND_WHERE
Actually, IMO this would be a *good* change because AFAIK those are
not all being checked by the row-filter walker. I think the only
reason all tests pass is that there are no specific regression tests
for these cases.
OTOH, there would also be a difference where an error message would
not be as nice. Please see the review comment from Vignesh [1]. The
improved error message is only possible by checking the
EXPR_KIND_PUBLICATION_WHERE.
~~
I think the best thing to do here is to leave the
EXPR_KIND_PUBLICATION_WHERE but simplify code so that the improved
error message remains as the *only* difference in behaviour from the
EXPR_KIND_WHERE. i.e. we should let the other
aggregate/grouping/window/set function checks give errors exactly the
same as for the EXPR_KIND_WHERE case.
------
[1]: /messages/by-id/CALDaNm08Ynr_FzNg+doHj=_nBet+KZAvNbqmkEEw7M2SPpPEAw@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Wed, Dec 15, 2021 at 6:47 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Tue, Dec 14, 2021 at 10:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Dec 14, 2021 at 10:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Dec 14, 2021 at 4:44 AM Peter Smith <smithpb2250@gmail.com> wrote:
Few other comments:
===================

Few more comments:
==================
v46-0001/0002
===============
1. After rowfilter_walker() why do we need
EXPR_KIND_PUBLICATION_WHERE? I thought this is primarily to identify
the expressions that are not allowed in rowfilter which we are now
able to detect upfront with the help of a walker. Can't we instead use
EXPR_KIND_WHERE?

FYI - I have tried this locally and all tests pass.
~~
If the EXPR_KIND_PUBLICATION_WHERE is removed then there will be some
differences:
- we would get errors for aggregate/grouping functions from the EXPR_KIND_WHERE
- we would get errors for window functions from the EXPR_KIND_WHERE
- we would get errors for set-returning functions from the EXPR_KIND_WHERE

Actually, IMO this would be a *good* change because AFAIK those are
not all being checked by the row-filter walker. I think the only
reason all tests pass is that there are no specific regression tests
for these cases.

OTOH, there would also be a difference where an error message would
not be as nice. Please see the review comment from Vignesh. [1] The
improved error message is only possible by checking the
EXPR_KIND_PUBLICATION_WHERE.

~~
I think the best thing to do here is to leave the
EXPR_KIND_PUBLICATION_WHERE but simplify code so that the improved
error message remains as the *only* difference in behaviour from the
EXPR_KIND_WHERE. i.e. we should let the other
aggregate/grouping/window/set function checks give errors exactly the
same as for the EXPR_KIND_WHERE case.
I am not sure if "the better error message" is a good enough reason
to introduce this new kind. I thought it is better to deal with that
in rowfilter_walker.
--
With Regards,
Amit Kapila.
On Mon, Dec 13, 2021 at 8:49 PM Peter Smith <smithpb2250@gmail.com> wrote:
PSA the v46* patch set.
0001
(1)
"If a subscriber is a pre-15 version, the initial table
synchronization won't use row filters even if they are defined in the
publisher."
Won't this lead to data inconsistencies or errors that otherwise
wouldn't happen? Should such subscriptions be allowed?
(2) In the 0001 patch comment, the term "publication filter" is used
in one place, and in others "row filter" or "row-filter".
src/backend/catalog/pg_publication.c
(3) GetTransformedWhereClause() is missing a function comment.
(4)
The following comment seems incomplete:
+ /* Fix up collation information */
+ whereclause = GetTransformedWhereClause(pstate, pri, true);
src/backend/parser/parse_relation.c
(5)
wording? consistent?
Shouldn't it be "publication WHERE expression" for consistency?
+ errmsg("publication row-filter WHERE invalid reference to table \"%s\"",
+ relation->relname),
src/backend/replication/logical/tablesync.c
(6)
(i) Improve wording:
BEFORE:
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY command.
*/
AFTER:
/*
- * Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * Get information about a remote relation, in a similar fashion to
how the RELATION
+ * message provides information during replication. This function
also returns the relation
+ * qualifications to be used in the COPY command.
*/
(ii) fetch_remote_table_info() doesn't currently account for ALL
TABLES and ALL TABLES IN SCHEMA.
src/backend/replication/pgoutput/pgoutput.c
(7) pgoutput_row_filter()
I think that the "ExecDropSingleTupleTableSlot(entry->scantuple);" is
not needed in pgoutput_row_filter() - I don't think it can be non-NULL
when entry->exprstate_valid is false
(8) I am a little unsure about this "combine filters on copy
(irrespective of pubaction)" functionality. What if a filter is
specified and the only pubaction is DELETE?
0002
src/backend/catalog/pg_publication.c
(1) rowfilter_walker()
One of the errdetail messages doesn't begin with an uppercase letter:
+ errdetail_msg = _("user-defined types are not allowed");
src/backend/executor/execReplication.c
(2) CheckCmdReplicaIdentity()
Strictly speaking, the following:
+ if (invalid_rfcolnum)
should be:
+ if (invalid_rfcolnum != InvalidAttrNumber)
0003
src/backend/replication/logical/tablesync.c
(1)
Column name in comment should be "puballtables" not "puballtable":
+ * If any publication has puballtable true then all row-filtering is
(2) pgoutput_row_filter_init()
There should be a space before the final "*/" (so the asterisks align).
Also, should say "... treated the same".
/*
+ * If the publication is FOR ALL TABLES then it is treated same as if this
+ * table has no filters (even if for some other publication it does).
+ */
Regards,
Greg Nancarrow
Fujitsu Australia
On Wed, Dec 15, 2021 at 10:20 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Dec 13, 2021 at 8:49 PM Peter Smith <smithpb2250@gmail.com> wrote:
PSA the v46* patch set.
0001
(1)
"If a subscriber is a pre-15 version, the initial table
synchronization won't use row filters even if they are defined in the
publisher."

Won't this lead to data inconsistencies or errors that otherwise
wouldn't happen?
How? The subscribers will get all the initial data.
Should such subscriptions be allowed?
I am not sure what you have in mind here? How can we change the
already released code pre-15 for this new feature?
--
With Regards,
Amit Kapila.
On Wed, Dec 15, 2021 at 5:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
"If a subscriber is a pre-15 version, the initial table
synchronization won't use row filters even if they are defined in the
publisher."

Won't this lead to data inconsistencies or errors that otherwise
wouldn't happen?

How? The subscribers will get all the initial data.
But couldn't getting all the initial data (i.e. not filtering) break
the rules used by the old/new row processing (see v46-0003 patch)?
Those rules effectively assume rows have been previously published
with filtering.
So, for example, for the following case for UPDATE:
old-row (no match) new row (match) -> INSERT
the old-row check (no match) infers that the old row was never
published, but that row could in fact have been in the initial
unfiltered rows, so in that case an INSERT gets erroneously published
instead of an UPDATE, doesn't it?
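The old/new row transformation rules from the v46-0003 patch can be
written as a small decision table. A Python sketch of that logic (names
are illustrative; the real code works on ReorderBufferChangeType values):

```python
def transform_update(old_matches: bool, new_matches: bool):
    """Decide how an UPDATE is published, given whether the old and the
    new row version each match the row filter."""
    if not old_matches and not new_matches:
        return None      # drop the change entirely
    if not old_matches and new_matches:
        return "INSERT"  # row is entering the published set
    if old_matches and not new_matches:
        return "DELETE"  # row is leaving the published set
    return "UPDATE"      # row stays inside the published set

assert transform_update(False, True) == "INSERT"
assert transform_update(True, False) == "DELETE"
assert transform_update(True, True) == "UPDATE"
assert transform_update(False, False) is None
```

The concern above is visible in the second branch: if the initial COPY
was unfiltered, a non-matching old row may nonetheless already exist on
the subscriber, so publishing an INSERT instead of an UPDATE can
conflict with it.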
Should such subscriptions be allowed?
I am not sure what you have in mind here? How can we change the
already released code pre-15 for this new feature?
I was thinking such subscription requests could be rejected by the
server, based on the subscriber version and whether the publications
use filtering etc.
Regards,
Greg Nancarrow
Fujitsu Australia
On Wed, Dec 15, 2021 at 1:52 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Wed, Dec 15, 2021 at 5:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
"If a subscriber is a pre-15 version, the initial table
synchronization won't use row filters even if they are defined in the
publisher."

Won't this lead to data inconsistencies or errors that otherwise
wouldn't happen?

How? The subscribers will get all the initial data.
But couldn't getting all the initial data (i.e. not filtering) break
the rules used by the old/new row processing (see v46-0003 patch)?
Those rules effectively assume rows have been previously published
with filtering.
So, for example, for the following case for UPDATE:
old-row (no match) new row (match) -> INSERT
the old-row check (no match) infers that the old row was never
published, but that row could in fact have been in the initial
unfiltered rows, so in that case an INSERT gets erroneously published
instead of an UPDATE, doesn't it?
But this can happen even when both the publisher and subscriber are
from v15, say if the user defines filter at some later point or change
the filter conditions by Alter Publication. So, not sure if we need to
invent something new for this.
Should such subscriptions be allowed?
I am not sure what you have in mind here? How can we change the
already released code pre-15 for this new feature?

I was thinking such subscription requests could be rejected by the
server, based on the subscriber version and whether the publications
use filtering etc.
Normally, the client sends some parameters to the server like
(streaming, two_pc, etc.) based on which server can take such
decisions. We may need to include some such thing which I am not sure
is required for this particular case especially because that can
happen otherwise as well.
--
With Regards,
Amit Kapila.
On Mon, Dec 13, 2021 5:49 PM Peter Smith <smithpb2250@gmail.com> wrote:
PSA the v46* patch set.
Here are the main differences from v45:
0. Rebased to HEAD
1. Integrated many comments, docs, messages, code etc. from Euler's
patch [Euler 6/12]
2. Several bugfixes
3. Patches are merged/added

~~
Bugfix and Patch Merge details:
v46-0001 (main)
- Merged from v45-0001 (main) + v45-0005 (exprstate)
- Fix for mem leak reported by Greg (off-list)

v46-0002 (validation)
- Merged from v45-0002 (node validation) + v45-0006 (replica identity validation)

v46-0003
- Rebased from v45-0003
- Fix for partition column order [Houz 9/12]
- Fix for core dump reported by Tang (off-list)

v46-0004 (tab-complete and dump)
- Rebased from v45-0004

v46-0005 (for all tables)
- New patch
- Fix for FOR ALL TABLES [Tang 7/12]
Thanks for updating the patch.
When reviewing the patch, I found the patch allows using system columns in
row filter expression.
---
create publication pub for table test WHERE ('(0,1)'::tid=ctid);
---
Since we can't create index on system column and most
existing expression feature(index expr,partition expr,table constr) doesn't
allow using system column, I think it might be better to disallow using system
column when creating or altering the publication. We can check like:
rowfilter_walker(Node *node, Relation relation)
...
if (var->varattno < 0)
ereport(ERROR,
errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
errmsg("cannot use system column \"%s\" in publication WHERE expression",
...
Best regards,
Hou zj
Kindly do not change the mode of src/backend/parser/gram.y.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
PSA the v47* patch set.
Main differences from v46:
0. Rebased to HEAD
1. Addressed multiple review comments
~~
Details:
v47-0001 (main)
- Quick loop exit if no filter for all pubactions [Amit 14/12] #1
- Remove duplicated comment [Amit 14/12] #2
- Remove code block parens [Amit 14/12] #3
- GetTransformedWhereClause add function comment [Amit 14/12] #2,
[Greg 15/12] #3
- GetTransformedWhereClause change variable names [Amit 14/12] #3
- Commit comment wording [Greg 15/12] #2
- Fix incomplete comment [Greg 15/12] #4
- Wording of error message [Greg 15/12] #5
- Wording in tablesync comment [Greg 15/12] #6
- PG docs for FOR ALL TABLES
- Added regression tests for aggregate functions
v47-0002 (validation)
- Remove EXPR_KIND_PUBLICATION_WHERE [Amit 14/12] #1
- Refactor function for simple nodes [Amit 14/12] #4
- Fix case of error message [Greg 15/12] #1
- Cleanup code not using InvalidAttrNumber [Greg 15/12] #2
v47-0003 (new/old tuple)
- No change
v47-0004 (tab-complete and dump)
- No change
v47-0005 (for all tables)
- Fix comment in tablesync [Greg 15/12] #1
- Fix comment alignment [Greg 15/12] #2
- Add support for ALL TABLES IN SCHEMA [Amit 14/12]
- Use a unified SQL in the tablesync COPY [Amit 14/12]
- Quick loop exits if no filter for all pubactions [Amit 14/12] #1
- Added new TAP test case for FOR ALL TABLES
- Added new TAP test case for ALL TABLES IN SCHEMA
- Updated commit comment
------
[Amit 14/12] /messages/by-id/CAA4eK1JdLzJEmxxzEEYAOg41Om3Y88uL+7CgXdvnAaj7hkw8BQ@mail.gmail.com
[Amit 14/12] /messages/by-id/CAA4eK1+aiyjD4C1gohBZyZivrMruCE=9Mgmgtaq1gFvfRBU-wA@mail.gmail.com
[Greg 15/12] /messages/by-id/CAJcOf-dFo_kTroR2_k1x80TqN=-3oZC_2BGYe1O6e5JinrLKYg@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v47-0003-Row-filter-updates-based-on-old-new-tuples.patchapplication/octet-stream; name=v47-0003-Row-filter-updates-based-on-old-new-tuples.patchDownload
From c5a26345d3a0ddaa1f2cb7420899d37dd91d2239 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 16 Dec 2021 11:28:17 +1100
Subject: [PATCH v47] Row-filter updates based on old/new tuples
When applying row filter on updates, check both old_tuple and new_tuple
to decide how an update needs to be transformed.
UPDATE
old-row (match) new-row (no match) -> DELETE
old-row (no match) new row (match) -> INSERT
old-row (match) new row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Also, tuples that have been deformed will be cached in slots to avoid
deforming the same tuple multiple times.
Author: Ajin Cherian
---
src/backend/replication/logical/proto.c | 38 +++--
src/backend/replication/pgoutput/pgoutput.c | 228 ++++++++++++++++++++++++----
src/include/replication/logicalproto.h | 7 +-
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/027_row_filter.pl | 4 +-
src/tools/pgindent/typedefs.list | 1 +
6 files changed, 235 insertions(+), 49 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..110ccff 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -31,8 +32,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
- HeapTuple tuple, bool binary);
-
+ HeapTuple tuple, TupleTableSlot *slot,
+ bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -410,7 +411,7 @@ logicalrep_write_insert(StringInfo out, TransactionId xid, Relation rel,
pq_sendint32(out, RelationGetRelid(rel));
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, NULL, binary);
}
/*
@@ -442,7 +443,8 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
*/
void
logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
- HeapTuple oldtuple, HeapTuple newtuple, bool binary)
+ HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary)
{
pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
@@ -463,11 +465,11 @@ logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
pq_sendbyte(out, 'O'); /* old tuple follows */
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, oldslot, binary);
}
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, newslot, binary);
}
/*
@@ -536,7 +538,7 @@ logicalrep_write_delete(StringInfo out, TransactionId xid, Relation rel,
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, NULL, binary);
}
/*
@@ -749,13 +751,16 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
-logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binary)
+logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, TupleTableSlot *slot,
+bool binary)
{
TupleDesc desc;
- Datum values[MaxTupleAttributeNumber];
- bool isnull[MaxTupleAttributeNumber];
+ Datum *values;
+ bool *isnull;
int i;
uint16 nliveatts = 0;
+ Datum attr_values[MaxTupleAttributeNumber];
+ bool attr_isnull[MaxTupleAttributeNumber];
desc = RelationGetDescr(rel);
@@ -771,7 +776,17 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binar
enlargeStringInfo(out, tuple->t_len +
nliveatts * (1 + 4));
- heap_deform_tuple(tuple, desc, values, isnull);
+ if (TupIsNull(slot))
+ {
+ values = attr_values;
+ isnull = attr_isnull;
+ heap_deform_tuple(tuple, desc, values, isnull);
+ }
+ else
+ {
+ values = slot->tts_values;
+ isnull = slot->tts_isnull;
+ }
/* Write the values */
for (i = 0; i < desc->natts; i++)
@@ -832,6 +847,7 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binar
ReleaseSysCache(typtup);
}
+
}
/*
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6fbee17..15b1936 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,6 +13,7 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
#include "catalog/pg_publication_rel.h"
@@ -25,6 +26,7 @@
#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
#include "utils/builtins.h"
@@ -140,6 +142,9 @@ typedef struct RelationSyncEntry
ExprState *exprstate[IDX_PUBACTION_n]; /* ExprState array for row filter.
One per publication action. */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
+ TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
+ TupleTableSlot *tmp_new_tuple; /* slot for temporary new tuple used for expression evaluation */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -174,11 +179,15 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
- Relation relation, HeapTuple oldtuple,
- HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter(enum ReorderBufferChangeType changetype, Relation relation,
+ HeapTuple oldtuple, HeapTuple newtuple,
+ TupleTableSlot *slot, RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update_check(enum ReorderBufferChangeType changetype, Relation relation,
+ HeapTuple oldtuple, HeapTuple newtuple,
+ RelationSyncEntry *entry, ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -742,26 +751,124 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * Update is checked against the row filter, if any.
+ * Updates are transformed to inserts and deletes based on the
+ * old_tuple and new_tuple. The new action is updated in the
+ * action parameter. If not updated, action remains as update.
+ * old-row (no match) new-row (no match) -> (drop change)
+ * old-row (no match) new row (match) -> INSERT
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (match) new row (match) -> UPDATE
+ * If the change is to be replicated returns true, else false.
*/
static bool
-pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
- Relation relation, HeapTuple oldtuple, HeapTuple newtuple,
- RelationSyncEntry *entry)
+pgoutput_row_filter_update_check(enum ReorderBufferChangeType changetype, Relation relation,
+ HeapTuple oldtuple, HeapTuple newtuple,
+ RelationSyncEntry *entry, ReorderBufferChangeType *action)
{
- EState *estate;
- ExprContext *ecxt;
- ListCell *lc;
- bool result = true;
- Oid relid = RelationGetRelid(relation);
- List *rfnodes[] = {NIL, NIL, NIL}; /* One per pubaction */
- bool no_filter[] = {false, false, false}; /* One per pubaction */
+ TupleDesc desc = RelationGetDescr(relation);
+ int i;
+ bool old_matched, new_matched;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
Assert(changetype == REORDER_BUFFER_CHANGE_INSERT ||
changetype == REORDER_BUFFER_CHANGE_UPDATE ||
changetype == REORDER_BUFFER_CHANGE_DELETE);
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate[changetype])
+ return true;
+
+ /* update requires a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /* Clear the tuples */
+ ExecClearTuple(entry->old_tuple);
+ ExecClearTuple(entry->new_tuple);
+
+ /*
+ * If no old_tuple, then none of the replica identity columns changed
+ * and this would reduce to a simple update.
+ */
+ if (!oldtuple)
+ {
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ return pgoutput_row_filter(changetype, relation, NULL, newtuple, NULL, entry);
+ }
+
+ old_slot = entry->old_tuple;
+ new_slot = entry->new_tuple;
+ tmp_new_slot = entry->tmp_new_tuple;
+
+ ExecClearTuple(tmp_new_slot);
+
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, new_slot);
+
+ /*
+ * For updates, both the newtuple and oldtuple needs to be checked
+ * against the row-filter. The newtuple might not have all the
+ * replica identity columns, in which case it needs to be
+ * copied over from the oldtuple.
+ */
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ /* if the column in the new_tuple is null, nothing to do */
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ /*
+ * Unchanged toasted replica identity columns are
+ * only detoasted in the old tuple, copy this over to the newtuple.
+ */
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ Assert(!old_slot->tts_isnull[i] &&
+ !VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]));
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ }
+
+ }
+
+ old_matched = pgoutput_row_filter(changetype, relation, NULL, NULL, old_slot, entry);
+ new_matched = pgoutput_row_filter(changetype, relation, NULL, NULL, tmp_new_slot, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (!old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && old_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter the first time it is needed.
+ */
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
+{
+ ListCell *lc;
+ List *rfnodes[] = {NIL, NIL, NIL}; /* One per pubaction */
+ bool no_filter[] = {false, false, false}; /* One per pubaction */
+
/*
* If the row filter caching is currently flagged "invalid" then it means we
* don't know yet whether there are any row filters for this relation.
@@ -928,11 +1035,34 @@ pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
MemoryContextSwitchTo(oldctx);
}
entry->exprstate_valid = true;
}
+}
+
+/*
+ * The change is checked against the row filter, if any.
+ *
+ * Returns true if the change should be replicated; otherwise false.
+ */
+static bool
+pgoutput_row_filter(enum ReorderBufferChangeType changetype, Relation relation,
+ HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *slot,
+ RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+
+ Assert(changetype == REORDER_BUFFER_CHANGE_INSERT ||
+ changetype == REORDER_BUFFER_CHANGE_UPDATE ||
+ changetype == REORDER_BUFFER_CHANGE_DELETE);
/* Bail out if there is no row filter */
if (!entry->exprstate[changetype])
@@ -951,7 +1081,12 @@ pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
ecxt = GetPerTupleExprContext(estate);
ecxt->ecxt_scantuple = entry->scantuple;
- ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+ if (newtuple || oldtuple)
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+ else
+ {
+ ecxt->ecxt_scantuple = slot;
+ }
/*
* NOTE: Multiple publication row-filters have already been combined to a
@@ -964,7 +1099,6 @@ pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
}
/* Cleanup allocated resources */
- ResetExprContext(ecxt);
FreeExecutorState(estate);
PopActiveSnapshot();
@@ -1022,6 +1156,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -1029,10 +1166,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
- /* Check row filter. */
- if (!pgoutput_row_filter(change->action, data, relation, NULL, tuple, relentry))
- break;
-
/*
* Schema should be sent before the logic that replaces the
* relation because it also sends the ancestor's relation.
@@ -1050,6 +1183,11 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
tuple = execute_attr_map_tuple(tuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(change->action, relation, NULL, tuple,
+ NULL, relentry))
+ break;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_insert(ctx->out, xid, relation, tuple,
data->binary);
@@ -1061,10 +1199,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
-
- /* Check row filter. */
- if (!pgoutput_row_filter(change->action, data, relation, oldtuple, newtuple, relentry))
- break;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1085,9 +1220,34 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
}
+ /* Check row filter */
+ if (!pgoutput_row_filter_update_check(change->action, relation,
+ oldtuple, newtuple, relentry,
+ &modified_action))
+ break;
+
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ logicalrep_write_update(ctx->out, xid, relation,
+ oldtuple, newtuple, relentry->old_tuple,
+ relentry->new_tuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1096,10 +1256,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
- /* Check row filter. */
- if (!pgoutput_row_filter(change->action, data, relation, oldtuple, NULL, relentry))
- break;
-
maybe_send_schema(ctx, change, relation, relentry);
/* Switch relation if publishing via root. */
@@ -1113,6 +1269,11 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(change->action, relation, oldtuple,
+ NULL, NULL, relentry))
+ break;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
data->binary);
@@ -1515,6 +1676,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
+ entry->new_tuple = NULL;
+ entry->old_tuple = NULL;
+ entry->tmp_new_tuple = NULL;
entry->exprstate[REORDER_BUFFER_CHANGE_INSERT] = NULL;
entry->exprstate[REORDER_BUFFER_CHANGE_UPDATE] = NULL;
entry->exprstate[REORDER_BUFFER_CHANGE_DELETE] = NULL;
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..427c40a 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -211,7 +212,11 @@ extern void logicalrep_write_insert(StringInfo out, TransactionId xid,
extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup);
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
- HeapTuple newtuple, bool binary);
+ HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index de6b73d..a2f25f6 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -277,7 +277,8 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+#   but the row is later removed, because the UPDATE below transforms into a DELETE.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -289,7 +290,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 575969c..e8dc5ad 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2198,6 +2198,7 @@ ReorderBufferApplyChangeCB
ReorderBufferApplyTruncateCB
ReorderBufferBeginCB
ReorderBufferChange
+ReorderBufferChangeType
ReorderBufferCommitCB
ReorderBufferCommitPreparedCB
ReorderBufferDiskChange
--
1.8.3.1
Attachment: v47-0004-Row-filter-tab-auto-complete-and-pgdump.patch
From cbcfa8b347c2a75d3a6380df0faf28cc0c5587e7 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 16 Dec 2021 11:52:47 +1100
Subject: [PATCH v47] Row-filter tab auto-complete and pgdump
tab-auto-complete
-----------------
e.g.
"CREATE PUBLICATION <name> FOR TABLE <name>" - complete with "WHERE (".
"ALTER PUBLICATION <name> ADD|SET TABLE <name>" - complete with "WHERE (".
Author: Peter Smith
pg_dump
-------
Author: Euler Taveira
---
src/bin/pg_dump/pg_dump.c | 24 ++++++++++++++++++++----
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/tab-complete.c | 24 ++++++++++++++++++++++--
3 files changed, 43 insertions(+), 6 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 10a86f9..e595c7f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4265,6 +4265,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4275,9 +4276,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4286,6 +4294,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4326,6 +4335,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationObject(&(pubrinfo[j].dobj), fout);
@@ -4393,8 +4406,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 6dccb4b..74f82cd 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -633,6 +633,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 2f412ca..8b2d0fd 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,19 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (");
+ /*
+ * "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with
+ * table attributes
+ *
+ * "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with
+ * table attributes
+ */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2777,13 +2790,20 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("TABLES", "TABLES IN SCHEMA");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
- else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
/*
+ * "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with
+ * table attributes
+ */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
+
+ /*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
* ..."
*/
--
1.8.3.1
Attachment: v47-0001-Row-filter-for-logical-replication.patch
From 7302fa8e730dda8eab1877151e16fbc9efed93b5 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 15 Dec 2021 17:02:15 +1100
Subject: [PATCH v47] Row-filter for logical replication.
This feature adds row filtering for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row-filter is per table. A new row-filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The row-filter WHERE clause for a table added to a publication that publishes
UPDATE and/or DELETE operations must contain only columns that are covered by
REPLICA IDENTITY. The row-filter WHERE clause for a table added to a publication
that publishes INSERT can use any column. If the row-filter evaluates to NULL,
it returns false. The WHERE clause allows simple expressions. Simple expressions
cannot contain any aggregate or window functions, non-immutable functions,
user-defined types, operators or functions. This restriction could possibly be
addressed in the future.
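As a sketch of these restrictions (using a hypothetical departments table whose columns are illustrative, not taken from the patch):

```sql
-- Hypothetical table: "id" is the primary key and therefore part of
-- the default REPLICA IDENTITY.
CREATE TABLE departments (id int PRIMARY KEY, name text, active bool);

-- Publishes only INSERTs, so the filter may use any column.
CREATE PUBLICATION pub_inserts FOR TABLE departments WHERE (active IS TRUE)
    WITH (publish = 'insert');

-- Publishes all operations, so the filter may reference only columns
-- covered by the REPLICA IDENTITY (here: id).
CREATE PUBLICATION pub_all FOR TABLE departments WHERE (id > 2000 AND id <= 3000);
```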
If you choose to do the initial table synchronization, only data that satisfies
the row-filters is pulled by the subscriber. If the subscription has several
publications in which a table has been published with different WHERE clauses,
rows which satisfy ANY of the expressions will be copied. If a subscriber is a
pre-15 version, the initial table synchronization won't use row-filters even
if they are defined in the publisher.
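Conceptually, with row filters the initial synchronization amounts to copying the result of a filtered query rather than the whole table. A sketch (hypothetical table and filters; the actual copy command built by tablesync may differ):

```sql
-- Subscriber-side view of the initial copy for a table published in
-- two publications, one with filter (id > 2000) and one with
-- (active IS TRUE); the filters are OR'ed.
COPY (SELECT * FROM departments WHERE (id > 2000) OR (active IS TRUE)) TO STDOUT;
```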
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row-filter (if
the parameter is false, the default) or the root partitioned table row-filter.
The psql commands \dRp+ and \d+ will display any row-filters.
Author: Euler Taveira, Peter Smith
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
Combining multiple row-filters
==============================
The subscription is treated "as a union of all the publications" [1], so the
row-filters are combined with OR.
If the subscription has several publications in which the same table has been
published with different filters, those expressions get OR'ed together so that
rows satisfying any of the expressions will be replicated.
Notice this means if one of the publications has no filter at all then all other
filters become redundant.
Author: Peter Smith
[1] https://www.postgresql.org/message-id/574b4e78-2f35-acf3-4bdc-4b872582e739%40enterprisedb.com
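For instance (hypothetical table and filters), subscribing to both of these publications replicates any row satisfying either expression:

```sql
CREATE PUBLICATION pub_low  FOR TABLE t WHERE (x < 10);
CREATE PUBLICATION pub_high FOR TABLE t WHERE (x > 100);

-- The filters are OR'ed for this subscription, i.e. effectively
-- WHERE (x < 10) OR (x > 100).
CREATE SUBSCRIPTION sub CONNECTION 'dbname=src' PUBLICATION pub_low, pub_high;
```

If a third publication published t with no filter at all, the combined condition would always be true and every row would be replicated.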
Row-filter caching
==================
The cached row-filters (e.g. ExprState *) are invalidated only in function
rel_sync_cache_relation_cb, so ALTER PUBLICATION for one table
will not invalidate the cached row-filters of other tables.
The caching of row-filters is performed just before they are needed
(in the pgoutput_row_filter function).
If there are multiple publication row-filters for a given table these are all
combined/flattened into a single filter.
Author: Peter Smith, Greg Nancarrow
The filter caching is based on suggestions from Amit [1][2] and Houz [3]
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
[3] https://www.postgresql.org/message-id/OS0PR01MB5716090A70A73ADF58C58950948D9%40OS0PR01MB5716.jpnprd01.prod.outlook.com
Cache ExprState per pubaction.
If a subscriber has multiple publications and these publications include the
same table then there can be multiple filters that apply to that table.
These filters are stored per pubaction of the publications. There are 4 kinds
of pubaction ("insert", "update", "delete", "truncate"), but row-filters are
not applied for "truncate".
Filters for the same pubaction are all combined (OR'ed) and cached as one, so
at the end there are at most 3 cached filters per table.
The appropriate (pubaction) filter is executed according to the DML operation.
Author: Peter Smith
Discussion: https://www.postgresql.org/message-id/CAA4eK1%2BhVXfOSScbf5LUB%3D5is%3DwYaC6NBhLxuvetbWQnZRnsVQ%40mail.gmail.com
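As an illustration (hypothetical publications), a table's filters contribute only to the pubactions its publications actually publish, yielding at most three cached filters:

```sql
-- Both publications include table t, but with different pubactions.
CREATE PUBLICATION pub_i  FOR TABLE t WHERE (a > 0) WITH (publish = 'insert');
CREATE PUBLICATION pub_ud FOR TABLE t WHERE (b > 0) WITH (publish = 'update, delete');

-- Resulting per-pubaction cached filters for t:
--   insert: (a > 0)
--   update: (b > 0)
--   delete: (b > 0)
```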
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 13 +-
doc/src/sgml/ref/create_publication.sgml | 37 ++-
doc/src/sgml/ref/create_subscription.sgml | 24 +-
src/backend/catalog/pg_publication.c | 69 ++++-
src/backend/commands/publicationcmds.c | 108 ++++++-
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 38 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/parser/parse_relation.c | 9 +
src/backend/replication/logical/tablesync.c | 118 +++++++-
src/backend/replication/pgoutput/pgoutput.c | 417 +++++++++++++++++++++++++++-
src/bin/psql/describe.c | 26 +-
src/include/catalog/pg_publication.h | 7 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 151 ++++++++++
src/test/regress/sql/publication.sql | 76 +++++
src/test/subscription/t/027_row_filter.pl | 357 ++++++++++++++++++++++++
24 files changed, 1458 insertions(+), 51 deletions(-)
mode change 100644 => 100755 src/backend/parser/gram.y
create mode 100644 src/test/subscription/t/027_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 03e2537..2f1f913 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6311,6 +6311,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index bb4ef5e..5d9869c 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -30,7 +30,7 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -52,7 +52,9 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
remove one or more tables/schemas from the publication. Note that adding
tables/schemas to a publication that is already subscribed to will require a
<literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on the
- subscribing side in order to become effective.
+ subscribing side in order to become effective. Note also that the combination
+ of <literal>DROP</literal> with <literal>WHERE</literal> clause is not
+ allowed.
</para>
<para>
@@ -109,7 +111,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index d805e8e..5aeee23 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -28,7 +28,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -79,6 +79,13 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ If the optional <literal>WHERE</literal> clause is specified, only rows
+ that satisfy the <replaceable class="parameter">expression</replaceable>
+ will be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
+ </para>
+
+ <para>
Only persistent base tables and partitioned tables can be part of a
publication. Temporary tables, unlogged tables, foreign tables,
materialized views, and regular views cannot be part of a publication.
@@ -226,6 +233,22 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ A <literal>WHERE</literal> clause must contain only columns that are
+ part of the primary key or are covered by the <literal>REPLICA
+ IDENTITY</literal>, in order for <command>UPDATE</command> and
+ <command>DELETE</command> operations to be published.
+ For publication of <command>INSERT</command> operations, any column
+ may be used in the <literal>WHERE</literal> clause.
+ If nullable columns are present in the <literal>WHERE</literal> clause,
+ possible NULL values should be accounted for in expressions, to avoid
+ unexpected results, because <literal>NULL</literal> values can cause
+ those expressions to evaluate to false.
+ A <literal>WHERE</literal> clause allows simple expressions. The simple
+ expression cannot contain any aggregate or window functions, non-immutable
+ functions, user-defined types, operators or functions.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -247,6 +270,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -260,6 +288,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..db255f3 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -208,6 +208,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
that are being subscribed to when the replication starts.
The default is <literal>true</literal>.
</para>
+ <para>
+ If the publications contain conditional expressions, they will affect
+ what data is copied. Refer to the
+ <xref linkend="sql-createsubscription-notes" /> for details.
+ </para>
</listitem>
</varlistentry>
@@ -293,7 +298,7 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</variablelist>
</refsect1>
- <refsect1>
+ <refsect1 id="sql-createsubscription-notes" xreflabel="Notes">
<title>Notes</title>
<para>
@@ -319,6 +324,23 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
the parameter <literal>create_slot = false</literal>. This is an
implementation restriction that might be lifted in a future release.
</para>
+
+ <para>
+ If any table in the publication has a <literal>WHERE</literal> clause, rows
+ that do not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published (i.e. they will be filtered out).
+ If the subscription has several publications in which the same table has been
+ published with different <literal>WHERE</literal> clauses, those expressions
+ (for the same publish operation) get OR'ed together so that rows satisfying any
+ of the expressions will be published. Also, if one of the publications for the
+ same table has no <literal>WHERE</literal> clause at all, or is a <literal>FOR
+ ALL TABLES</literal> or <literal>FOR ALL TABLES IN SCHEMA</literal> publication,
+ then all other <literal>WHERE</literal> clauses (for the same publish operation)
+ become redundant.
+ If the subscriber is a <productname>PostgreSQL</productname> version before 15
+ then any row filtering is ignored during the initial data synchronization phase.
+ </para>
+
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 62f10bc..0929aa0 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -276,21 +279,54 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
}
/*
+ * Transform a publication WHERE clause, ensuring it is coerced to boolean and
+ * necessary collation information is added if required, and add a new
+ * nsitem/RTE for the associated relation to the ParseState's namespace list.
+ */
+Node *
+GetTransformedWhereClause(ParseState *pstate, PublicationRelInfo *pri,
+ bool fixup_collation)
+{
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
+
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, pri->relation,
+ AccessShareLock, NULL, false, false);
+
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate, copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION WHERE");
+
+ /* Fix up collation information */
+ if (fixup_collation)
+ assign_expr_collations(pstate, whereclause);
+
+ return whereclause;
+}
+
+/*
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -311,10 +347,22 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a ParseState to parse with */
+ pstate = make_parsestate(NULL);
+
+ /*
+ * Get the transformed WHERE clause, of boolean type, with necessary
+ * collation information.
+ */
+ whereclause = GetTransformedWhereClause(pstate, pri, true);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -328,6 +376,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -344,6 +398,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
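As a rough illustration of what the catalog changes above amount to: the transformed WHERE expression is stored serialized in the new prqual column, and a missing filter is stored as NULL. This is a toy Python model, not the patch's API; `serialize` merely stands in for nodeToString().

```python
def serialize(node):
    # Stand-in for nodeToString(): the catalog stores the transformed
    # expression tree as text.
    return repr(node)

def form_publication_rel_values(pubid, relid, transformed_qual):
    """Model of how publication_add_relation() fills pg_publication_rel:
    prqual holds the serialized WHERE expression, or NULL (None here)
    when the table has no row filter."""
    return {
        "prpubid": pubid,
        "prrelid": relid,
        "prqual": serialize(transformed_qual) if transformed_qual is not None else None,
    }
```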
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 404bb5d..9ca743c 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -529,40 +529,96 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *delrels = NIL;
ListCell *oldlc;
+ /*
+ * Check if the relation is a member of an existing schema in the
+ * publication or a member of the specified schema list.
+ */
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
- /* Calculate which relations to drop. */
+ /*
+ * In order to recreate the relation list for the publication,
+ * look for existing relations that need not be dropped.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
+ ListCell *newlc;
+ PublicationRelInfo *oldrel;
bool found = false;
+ HeapTuple rftuple;
+ bool rfisnull = true;
+ Node *oldrelwhereclause = NULL;
+
+ /* Look up the cache for the old relation map entry. */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(oldrelid),
+ ObjectIdGetDatum(pubid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ Datum whereClauseDatum;
+
+ whereClauseDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual,
+ &rfisnull);
+ if (!rfisnull)
+ oldrelwhereclause = stringToNode(TextDatumGetCString(whereClauseDatum));
+
+ ReleaseSysCache(rftuple);
+ }
foreach(newlc, rels)
{
PublicationRelInfo *newpubrel;
newpubrel = (PublicationRelInfo *) lfirst(newlc);
+
+ /*
+ * Check whether any of the new set of relations matches one of
+ * the existing relations in the publication. Additionally,
+ * if the relation has an associated WHERE clause, check that the
+ * WHERE clauses also match. Relations left unmatched are dropped.
+ */
if (RelationGetRelid(newpubrel->relation) == oldrelid)
{
- found = true;
- break;
+ if (rfisnull && !newpubrel->whereClause)
+ {
+ found = true;
+ break;
+ }
+
+ if (!rfisnull && newpubrel->whereClause)
+ {
+ ParseState *pstate = make_parsestate(NULL);
+ Node *whereclause;
+
+ whereclause = GetTransformedWhereClause(pstate,
+ newpubrel,
+ false);
+ if (equal(oldrelwhereclause, whereclause))
+ {
+ free_parsestate(pstate);
+ found = true;
+ break;
+ }
+
+ free_parsestate(pstate);
+ }
}
}
- /* Not yet in the list, open it and add to the list */
- if (!found)
- {
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
+ if (oldrelwhereclause)
+ pfree(oldrelwhereclause);
- delrels = lappend(delrels, pubrel);
+ /*
+ * Add the non-matched relations to a list so that they can
+ * be dropped.
+ */
+ if (!found)
+ {
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrelid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
}
@@ -899,6 +955,7 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ List *relids_with_rf = NIL;
/*
* Open, share-lock, and check all the explicitly-specified relations
@@ -926,15 +983,26 @@ OpenTableList(List *tables)
*/
if (list_member_oid(relids, myrelid))
{
+ /* Disallow duplicate tables if there are any with row-filters. */
+ if (t->whereClause || list_member_oid(relids_with_rf, myrelid))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("conflicting or redundant row-filters for \"%s\"",
+ RelationGetRelationName(rel))));
+
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
+ if (t->whereClause)
+ relids_with_rf = lappend_oid(relids_with_rf, myrelid);
+
/*
* Add children of this rel, if requested, so that they too are added
* to the publication. A partitioned table can't have any inheritance
@@ -967,6 +1035,8 @@ OpenTableList(List *tables)
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -974,6 +1044,7 @@ OpenTableList(List *tables)
}
list_free(relids);
+ list_free(relids_with_rf);
return rels;
}
@@ -993,6 +1064,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -1088,6 +1161,11 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
RelationGetRelationName(rel))));
}
+ if (pubrel->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE")));
+
ObjectAddressSet(obj, PublicationRelRelationId, prid);
performDeletion(&obj, DROP_CASCADE, 0);
}
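In essence, the new drop-list computation in AlterPublicationTables keeps an old relation only if the new relation list contains the same relation with an equal (or equally absent) WHERE clause. A minimal Python sketch of that rule follows; the data shapes are hypothetical, and the real code compares transformed node trees with equal(), not strings.

```python
def relations_to_drop(old_rels, new_rels):
    """Model of the drop-list computation above.

    old_rels: dict mapping relation OID -> row-filter expression (None = no filter)
    new_rels: list of (relation OID, row-filter expression or None)
    Returns the OIDs that must be dropped from the publication.
    """
    drops = []
    for oldrelid, old_qual in old_rels.items():
        # Keep the relation only when the new list has the same OID with
        # a matching (possibly absent) row filter.
        found = any(relid == oldrelid and qual == old_qual
                    for relid, qual in new_rels)
        if not found:
            drops.append(oldrelid)
    return drops
```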
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index df0b747..bd55ea6 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4833,6 +4833,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index cb7ddd4..028b8e5 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2312,6 +2312,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 3d4dd43..9da93a0
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9742,12 +9742,13 @@ CreatePublicationStmt:
* relation_expr here.
*/
PublicationObjSpec:
- TABLE relation_expr
+ TABLE relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_TABLE;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $2;
+ $$->pubtable->whereClause = $3;
}
| ALL TABLES IN_P SCHEMA ColId
{
@@ -9762,28 +9763,45 @@ PublicationObjSpec:
$$->pubobjtype = PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA;
$$->location = @5;
}
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ /*
+ * The OptWhereClause must be stored here but it is
+ * valid only for tables. If the ColId is mistakenly
+ * not a table, this will be detected later in
+ * preprocess_pubobj_list() and an error will be thrown.
+ */
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
$$->location = @1;
}
- | ColId indirection
+ | ColId indirection OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
+ $$->pubtable->whereClause = $3;
$$->location = @1;
}
/* grammar like tablename * , ONLY tablename, ONLY ( tablename ) */
- | extended_relation_expr
+ | extended_relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $1;
+ $$->pubtable->whereClause = $2;
}
| CURRENT_SCHEMA
{
@@ -17431,7 +17449,8 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ if (pubobj->name)
{
/* convert it to PublicationTable */
PublicationTable *pubtable = makeNode(PublicationTable);
@@ -17444,6 +17463,13 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
else if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_SCHEMA ||
pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA)
{
+ /* WHERE clause is not allowed on a schema object */
+ if (pubobj->pubtable && pubobj->pubtable->whereClause)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("WHERE clause for schema not allowed"),
+ parser_errposition(pubobj->location));
+
/*
* We can distinguish between the different type of schema
* objects based on whether name and pubtable is set.
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 542f916..29bebb7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index c5c3f26..036d9c6 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -3538,11 +3538,20 @@ errorMissingRTE(ParseState *pstate, RangeVar *relation)
rte->eref->aliasname)),
parser_errposition(pstate, relation->location)));
else
+ {
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_TABLE),
+ errmsg("publication WHERE expression invalid reference to table \"%s\"",
+ relation->relname),
+ parser_errposition(pstate, relation->location)));
+
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_TABLE),
errmsg("missing FROM-clause entry for table \"%s\"",
relation->relname),
parser_errposition(pstate, relation->location)));
+ }
}
/*
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..c20c221 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -687,20 +687,24 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
- * Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * Get information about a remote relation, in a similar fashion to how the
+ * RELATION message provides information during replication. This function also
+ * returns the relation qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,80 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get the relation qual. DISTINCT prevents the same expression for a
+ * table in multiple publications from being included multiple times in
+ * the final expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ /*
+ * Multiple row filter expressions for the same table will be combined
+ * by COPY using OR. If any of the filter expressions for this table are
+ * null, it means the whole table will be copied. In this case it is not
+ * necessary to construct a unified row filter expression at all.
+ */
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+
+ /*
+ * A single entry without a row filter expression means we must clean
+ * up any previous expressions and return with no expressions at all.
+ */
+ if (isnull)
+ {
+ if (*qual)
+ {
+ list_free_deep(*qual);
+ *qual = NIL;
+ }
+ break;
+ }
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +887,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +896,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +907,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with OR'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +927,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of OR'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
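The tablesync changes above boil down to the following query-construction rule, sketched here in Python. This is a simplified model: the real copy_table() quotes identifiers, handles generated columns, and takes the plain-COPY path only for regular tables without any row filter.

```python
def build_copy_query(nspname, relname, attnames, quals):
    """Sketch of copy_table()'s initial-sync query for a regular table.

    quals: row-filter expressions fetched from the publisher; an empty
    list means the whole table is copied.
    """
    target = f"{nspname}.{relname}"
    if not quals:
        # No row filter anywhere: copy the table directly.
        return f"COPY {target} TO STDOUT"
    # Multiple filters for the same table are OR'ed together.
    cols = ", ".join(attnames)
    where = " OR ".join(quals)
    return f"COPY (SELECT {cols} FROM {target} WHERE {where}) TO STDOUT"
```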
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6f6a203..6fbee17 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -15,16 +15,24 @@
#include "access/tupconvert.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/makefuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -116,6 +124,24 @@ typedef struct RelationSyncEntry
PublicationActions pubactions;
/*
+ * ExprState cannot be used to indicate no cache, invalid cache and valid
+ * cache, so the flag exprstate_valid indicates if the current cache is
+ * valid.
+ *
+ * Multiple ExprState entries might be used if there are multiple
+ * publications for a single table. Different publication actions don't
+ * always allow multiple expressions to be combined into one, so there is
+ * one ExprState per publication action. Only 3 publication actions are used
+ * for row filtering ("insert", "update", "delete"). The exprstate array is
+ * indexed by ReorderBufferChangeType.
+ */
+ bool exprstate_valid;
+#define IDX_PUBACTION_n 3
+ ExprState *exprstate[IDX_PUBACTION_n]; /* ExprState array for row filter.
+ One per publication action. */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+
+ /*
* OID of the relation to publish changes as. For a partition, this may
* be set to one of its ancestors whose schema will be used when
* replicating changes, if publish_via_partition_root is set for the
@@ -137,7 +163,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +172,14 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
+ Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +655,323 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+
+ /* Cache ExprState using CacheMemoryContext. */
+ Assert(CurrentMemoryContext == CacheMemoryContext);
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be cast to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * This is the same code as ExecPrepareExpr() but that is not used because
+ * we have no EState to pass it. There should probably be another function
+ * in the executor to handle the execution outside a normal Plan tree
+ * context.
+ */
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false i.e. the change
+ * isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
+ Relation relation, HeapTuple oldtuple, HeapTuple newtuple,
+ RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ List *rfnodes[] = {NIL, NIL, NIL}; /* One per pubaction */
+ bool no_filter[] = {false, false, false}; /* One per pubaction */
+
+ Assert(changetype == REORDER_BUFFER_CHANGE_INSERT ||
+ changetype == REORDER_BUFFER_CHANGE_UPDATE ||
+ changetype == REORDER_BUFFER_CHANGE_DELETE);
+ /*
+ * If the row filter caching is currently flagged "invalid", it means we
+ * don't yet know whether there are any row filters for this relation.
+ *
+ * This code is usually executed only once.
+ *
+ * NOTE: The ExprState cache could have been created up-front in the
+ * function get_rel_sync_entry() instead of the deferred on-the-fly
+ * assignment below. The reason for doing it here instead is that there
+ * are some scenarios where get_rel_sync_entry() is called but no row
+ * will be published. For example, for truncate we may not need any row
+ * evaluation, so there is no need to compute it, and it would be wasted
+ * work if an error happens before the filter is actually evaluated. In
+ * the future there could be other operations (which use
+ * get_rel_sync_entry) that don't need to build an ExprState either.
+ * Furthermore, because the decision to publish or not is made AFTER the
+ * call to get_rel_sync_entry, the filter evaluation may not be
+ * necessary at all. So the decision was to defer this work until the
+ * last moment, when we know it is needed.
+ */
+ if (!entry->exprstate_valid)
+ {
+ MemoryContext oldctx;
+ int idx;
+ bool found_filters = false;
+ int idx_ins = REORDER_BUFFER_CHANGE_INSERT;
+ int idx_upd = REORDER_BUFFER_CHANGE_UPDATE;
+ int idx_del = REORDER_BUFFER_CHANGE_DELETE;
+
+ /* Release the tuple table slot if it already exists. */
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState and cache it in entry->exprstate.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ *
+ * NOTE: Multiple publications might have multiple row filters for this
+ * relation. Since row filter usage depends on the DML operation,
+ * there are multiple lists (one for each operation) to which row filters
+ * will be appended.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+
+ /*
+ * Look up whether there is a row filter, and if so remember it in a list (per
+ * pubaction). If not, remember that there was no filter for this pubaction.
+ * Code following this 'publications' loop will combine all filters.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(entry->publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ /* Gather the rfnodes per pubaction of this publication. */
+ if (pub->pubactions.pubinsert)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[idx_ins] = lappend(rfnodes[idx_ins], rfnode);
+ }
+ if (pub->pubactions.pubupdate)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[idx_upd] = lappend(rfnodes[idx_upd], rfnode);
+ }
+ if (pub->pubactions.pubdelete)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[idx_del] = lappend(rfnodes[idx_del], rfnode);
+ }
+ MemoryContextSwitchTo(oldctx);
+ }
+ else
+ {
+ /* Remember which pubactions have no row-filter. */
+ if (pub->pubactions.pubinsert)
+ no_filter[idx_ins] = true;
+ if (pub->pubactions.pubupdate)
+ no_filter[idx_upd] = true;
+ if (pub->pubactions.pubdelete)
+ no_filter[idx_del] = true;
+
+ /* Quick exit loop if all pubactions have no row-filter. */
+ if (no_filter[idx_ins] && no_filter[idx_upd] && no_filter[idx_del])
+ {
+ ReleaseSysCache(rftuple);
+ break;
+ }
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+
+ } /* loop all subscribed publications */
+
+ /*
+ * Now all the filters for all pubactions are known. Combine them when
+ * their pubactions are same.
+ *
+ * All row filter expressions will be discarded if there is one
+ * publication-relation entry without a row filter. That's because all
+ * expressions are aggregated by the OR operator. The row filter absence
+ * means replicate all rows so a single valid expression means publish
+ * this row.
+ */
+ for (idx = 0; idx < IDX_PUBACTION_n; idx++)
+ {
+ int n_filters;
+
+ if (no_filter[idx])
+ {
+ if (rfnodes[idx])
+ {
+ list_free_deep(rfnodes[idx]);
+ rfnodes[idx] = NIL;
+ }
+ }
+
+ /*
+ * If there were one or more filters for this pubaction, combine them
+ * (if necessary) and cache the ExprState.
+ */
+ n_filters = list_length(rfnodes[idx]);
+ if (n_filters > 0)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes[idx], -1) : linitial(rfnodes[idx]);
+ entry->exprstate[idx] = pgoutput_row_filter_init_expr(rfnode);
+ MemoryContextSwitchTo(oldctx);
+
+ found_filters = true; /* flag that we will need slots made */
+ }
+ } /* for each pubaction */
+
+ if (found_filters)
+ {
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create tuple table slots for row filter. Create a copy of the
+ * TupleDesc as it needs to live as long as the cache remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ entry->exprstate_valid = true;
+ }
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate[changetype])
+ return true;
+
+ if (message_level_is_interesting(DEBUG3))
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * NOTE: Multiple publication row filters have already been combined
+ * into a single exprstate (for this pubaction).
+ */
+ if (entry->exprstate[changetype])
+ {
+ /* Evaluate the row filter. */
+ result = pgoutput_row_filter_exec_expr(entry->exprstate[changetype], ecxt);
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +998,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +1022,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +1029,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(change->action, data, relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * Send the schema before switching the relation below, because
+ * maybe_send_schema() also sends the ancestor's schema.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +1062,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(change->action, data, relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +1096,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(change->action, data, relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +1165,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1116,9 +1487,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1139,8 +1511,13 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->exprstate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate[REORDER_BUFFER_CHANGE_INSERT] = NULL;
+ entry->exprstate[REORDER_BUFFER_CHANGE_UPDATE] = NULL;
+ entry->exprstate[REORDER_BUFFER_CHANGE_DELETE] = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1245,9 +1622,6 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
}
list_free(pubids);
@@ -1310,6 +1684,7 @@ static void
rel_sync_cache_relation_cb(Datum arg, Oid relid)
{
RelationSyncEntry *entry;
+ int idx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1354,6 +1729,25 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Clean up the row filter cache. (It will be rebuilt later if needed.)
+ */
+ entry->exprstate_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ /* Cleanup the ExprState for each of the pubactions. */
+ for (idx = 0; idx < IDX_PUBACTION_n; idx++)
+ {
+ if (entry->exprstate[idx] != NULL)
+ {
+ pfree(entry->exprstate[idx]);
+ entry->exprstate[idx] = NULL;
+ }
+ }
}
}
@@ -1365,6 +1759,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1769,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1392,6 +1789,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
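The per-pubaction combining rule implemented above in pgoutput.c can be summarized with a small sketch (illustrative only, not part of the patch; the names `combine_filters` and `publish` are hypothetical): if any subscribed publication has no row filter for a pubaction, every filter is discarded and all rows are published; otherwise the filters are OR'ed together.

```python
def combine_filters(filters):
    """filters: one entry per publication; None means 'no row filter'.
    Returns a combined predicate, or None meaning 'publish all rows'."""
    if any(f is None for f in filters):
        # One publication without a filter subsumes all other expressions.
        return None
    # Otherwise the row is published if ANY filter matches (OR semantics).
    return lambda row: any(f(row) for f in filters)

def publish(row, combined):
    """Decide whether a row is sent, given the combined filter."""
    return True if combined is None else combined(row)
```

For example, combining `(c % 2 = 0)` and `(c % 3 = 0)` publishes a row with `c = 4` but not one with `c = 5`, while adding a third publication with no filter publishes both.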
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 72d8547..7057828 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3150,17 +3150,21 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
" JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
"WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
"UNION\n"
"SELECT pubname\n"
+ " , pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+ " JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
"WHERE pr.prrelid = '%s'\n"
"UNION\n"
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
"WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
"ORDER BY 1;",
@@ -3196,6 +3200,13 @@ describeOneTableDetails(const char *schemaname,
printfPQExpBuffer(&buf, " \"%s\"",
PQgetvalue(result, i, 0));
+ /* row filter (if any) */
+ if (pset.sversion >= 150000)
+ {
+ if (!PQgetisnull(result, i, 1))
+ appendPQExpBuffer(&buf, " WHERE %s", PQgetvalue(result, i, 1));
+ }
+
printTableAddFooter(&cont, buf.data);
}
PQclear(result);
@@ -6320,8 +6331,12 @@ addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
for (i = 0; i < count; i++)
{
if (!singlecol)
+ {
printfPQExpBuffer(buf, " \"%s.%s\"", PQgetvalue(res, i, 0),
PQgetvalue(res, i, 1));
+ if (!PQgetisnull(res, i, 2))
+ appendPQExpBuffer(buf, " WHERE %s", PQgetvalue(res, i, 2));
+ }
else
printfPQExpBuffer(buf, " \"%s\"", PQgetvalue(res, i, 0));
@@ -6450,8 +6465,15 @@ describePublications(const char *pattern)
{
/* Get the tables for the specified publication */
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 902f2f2..96c55f6 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -20,6 +20,7 @@
#include "catalog/genbki.h"
#include "catalog/objectaddress.h"
#include "catalog/pg_publication_d.h"
+#include "parser/parse_node.h"
/* ----------------
* pg_publication definition. cpp turns this into
@@ -86,6 +87,7 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -123,13 +125,16 @@ extern List *GetPubPartitionOptionRelations(List *result,
extern bool is_publishable_relation(Relation rel);
extern bool is_schema_publication(Oid pubid);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
extern char *get_publication_name(Oid pubid, bool missing_ok);
+extern Node *GetTransformedWhereClause(ParseState *pstate,
+ PublicationRelInfo *pri,
+ bool bfixupcollation);
#endif /* PG_PUBLICATION_H */
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 4c5a8a3..e437a55 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3642,6 +3642,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 5ac2d66..6bf0bd7 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -239,6 +239,157 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
+ "public.testpub_rf_tbl3" WHERE ((e > 1000) AND (e < 2000))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE ((e > 1000) AND (e < 2000))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
+
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dplus_rf_yes FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dplus_rf_no FOR TABLE testpub_rf_tbl1;
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+ Table "public.testpub_rf_tbl1"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+----------+--------------+-------------
+ a | integer | | | | plain | |
+ b | text | | | | extended | |
+Publications:
+ "testpub_dplus_rf_no"
+ "testpub_dplus_rf_yes" WHERE (a > 1)
+
+DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+ Publication testpub_syntax1
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (e < 999)
+
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+ Publication testpub_syntax2
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "testpub_rf_myschema.testpub_rf_tbl5" WHERE (h < 999)
+
+DROP PUBLICATION testpub_syntax2;
+-- fail - WHERE row-filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a =...
+ ^
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+ERROR: WHERE clause for schema not allowed
+LINE 1: ...ax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf...
+ ^
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+RESET client_min_messages;
+-- fail - aggregate functions not allowed in WHERE clause
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e < AVG(e));
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...TION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e < AVG(e));
+ ^
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+ERROR: cannot add relation "testpub_rf_myschema1.testpub_rf_tb16" to publication
+DETAIL: Table's schema "testpub_rf_myschema1" is already part of the publication or part of the specified schema list.
+RESET client_min_messages;
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 56dd358..4b5ce05 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -134,6 +134,82 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dplus_rf_yes FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dplus_rf_no FOR TABLE testpub_rf_tbl1;
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;
+-- fail - WHERE row-filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
+RESET client_min_messages;
+-- fail - aggregate functions not allowed in WHERE clause
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e < AVG(e));
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+RESET client_min_messages;
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
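The expected counts in the TAP test below can be cross-checked with a small sketch (illustrative only, not part of the patch): tab_rowfilter_2 holds c = 1..20 and is published through tap_pub_1 with filter (c % 2 = 0) and tap_pub_2 with filter (c % 3 = 0), which are OR'ed together.

```python
# Rows 1..20, replicated if they match either publication's filter.
rows = range(1, 21)
replicated = [c for c in rows if c % 2 == 0 or c % 3 == 0]

# Matches the test's expected "13|2|20" for count(c), min(c), max(c).
print(len(replicated), min(replicated), max(replicated))  # 13 2 20
```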
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
new file mode 100644
index 0000000..64e71d0
--- /dev/null
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -0,0 +1,357 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 10;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4a FOR TABLE tab_rowfilter_4 WHERE (c % 2 = 0)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4b FOR TABLE tab_rowfilter_4"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) SELECT generate_series(1, 10)");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4a, tap_pub_4b"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if
+# the c value is divisible by 2 OR 3 (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(13|2|20), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but a publication without a filter
+# publishes everything, so the combined result is the same as having no
+# filter at all.
+# Expect all rows: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(10|1|10), 'check initial data copy from table tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false so use the row filter
+# from a partition
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) VALUES (21), (22), (23), (24), (25)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) VALUES (0), (11), (12)");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if the
+# c value is divisible by 2 or 3.
+#
+# Expect original rows (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+# Plus (21, 22, 24)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(16|2|24), 'check replicated rows to tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together but when there is no filter it just means
+# OR everything - i.e. the same as no filter at all.
+# Expect all rows from initial copy: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+# And also (0, 11, 12)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true so use the row filter
+# from the root partitioned table
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
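The test comments above describe how row filters combine when a table appears in several publications: the filter expressions are OR'ed, and a publication without a filter publishes every row. As a quick mental model (a hypothetical Python sketch, not PostgreSQL code), the semantics can be stated as:

```python
# Illustrative model (not PostgreSQL source) of how row filters from multiple
# publications combine for one table: the filter expressions are OR'ed, and a
# publication without a filter matches every row.

def row_is_published(value, filters):
    """filters: list of predicates; None means 'no filter' (match all)."""
    # One unfiltered publication makes every row publishable.
    if any(f is None for f in filters):
        return True
    # Otherwise the row must satisfy at least one filter (OR semantics).
    return any(f(value) for f in filters)

# tab_rowfilter_2: tap_pub_1 filter (c % 2 = 0), tap_pub_2 filter (c % 3 = 0)
filters_2 = [lambda c: c % 2 == 0, lambda c: c % 3 == 0]
published_2 = [c for c in range(1, 21) if row_is_published(c, filters_2)]
# 13 rows: 2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20

# tab_rowfilter_4: tap_pub_4a filter (c % 2 = 0), tap_pub_4b has no filter
filters_4 = [lambda c: c % 2 == 0, None]
published_4 = [c for c in range(1, 11) if row_is_published(c, filters_4)]
# all 10 rows, same as no filter at all
```

This mirrors the counts the tests assert for tab_rowfilter_2 (13 rows between 2 and 20) and tab_rowfilter_4 (all 10 rows).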
Attachment: v47-0005-Row-filter-handle-FOR-ALL-TABLES.patch (application/octet-stream)
From e0a13715260ad5e4cb819287eaa01d469f11a6db Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 16 Dec 2021 20:12:53 +1100
Subject: [PATCH v47] Row-filter handle FOR ALL TABLES
If one of the subscriber's publications was created using FOR ALL TABLES then
that implies NO row-filtering will be applied.
If one of the subscriber's publications was created using FOR ALL TABLES IN
SCHEMA and the table belongs to that same schema, then that also implies NO
row-filtering will be applied.
These rules override any other row filters from other subscribed publications.
Note that the initial COPY does not take publication operations into account.
Author: Peter Smith
Reported By: Tang
Discussion: https://www.postgresql.org/message-id/OS0PR01MB6113D82113AA081ACF710D0CFB6E9%40OS0PR01MB6113.jpnprd01.prod.outlook.com
---
src/backend/replication/logical/tablesync.c | 48 +++++++----
src/backend/replication/pgoutput/pgoutput.c | 63 ++++++++++++++-
src/test/subscription/t/027_row_filter.pl | 118 ++++++++++++++++++++++++++--
3 files changed, 202 insertions(+), 27 deletions(-)
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index c20c221..469aadc 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -802,21 +802,22 @@ fetch_remote_table_info(char *nspname, char *relname,
walrcv_clear_result(res);
/*
+ * If any publication has puballtables true then all row-filtering is
+ * ignored.
+ *
+ * If the relation belongs to a schema that a subscribed publication
+ * specified via ALL TABLES IN SCHEMA, all row-filtering is likewise ignored.
+ *
* Get relation qual. DISTINCT avoids the same expression of a table in
* multiple publications from being included multiple times in the final
* expression.
*/
if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
{
- resetStringInfo(&cmd);
- appendStringInfo(&cmd,
- "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
- " FROM pg_publication p "
- " INNER JOIN pg_publication_rel pr "
- " ON (p.oid = pr.prpubid) "
- " WHERE pr.prrelid = %u "
- " AND p.pubname IN (", lrel->remoteid);
+ StringInfoData pub_names;
+ /* Build the pubname list. */
+ initStringInfo(&pub_names);
first = true;
foreach(lc, MySubscription->publications)
{
@@ -825,11 +826,28 @@ fetch_remote_table_info(char *nspname, char *relname,
if (first)
first = false;
else
- appendStringInfoString(&cmd, ", ");
+ appendStringInfoString(&pub_names, ", ");
- appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ appendStringInfoString(&pub_names, quote_literal_cstr(pubname));
}
- appendStringInfoChar(&cmd, ')');
+
+ /* Check for row-filters */
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u AND p.pubname IN ( %s ) "
+ " AND NOT (select bool_or(puballtables) "
+ " FROM pg_publication "
+ " WHERE pubname in ( %s )) "
+ " AND (SELECT count(1)=0 "
+ " FROM pg_publication_namespace pn, pg_class c "
+ " WHERE c.oid = %u AND c.relnamespace = pn.pnnspid)",
+ lrel->remoteid,
+ pub_names.data,
+ pub_names.data,
+ lrel->remoteid);
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
@@ -847,18 +865,14 @@ fetch_remote_table_info(char *nspname, char *relname,
slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
{
- Datum rf = slot_getattr(slot, 1, &isnull);
+ Datum rf = slot_getattr(slot, 1, &isnull);
if (!isnull)
*qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
ExecClearTuple(slot);
- /*
- * One entry without a row filter expression means clean up
- * previous expressions (if there are any) and return with no
- * expressions.
- */
+ /* Ignore filters and clean up as necessary. */
if (isnull)
{
if (*qual)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 15b1936..327c7e5 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -919,13 +919,68 @@ pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntr
* relation. Since row filter usage depends on the DML operation,
* there are multiple lists (one for each operation) which row filters
* will be appended.
+ *
+ * NOTE: FOR ALL TABLES implies "use no filters" so it takes precedence.
+ *
+ * NOTE: ALL TABLES IN SCHEMA also implies "use no filters" if the
+ * table is a member of that schema.
*/
foreach(lc, data->publications)
{
- Publication *pub = lfirst(lc);
- HeapTuple rftuple;
- Datum rfdatum;
- bool rfisnull;
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+ List *schemarelids = NIL;
+
+ /*
+ * If the publication is FOR ALL TABLES then it is treated the same
+ * as if this table has no filters (even if for some other
+ * publication it does).
+ */
+ if (pub->alltables)
+ {
+ if (pub->pubactions.pubinsert)
+ no_filter[idx_ins] = true;
+ if (pub->pubactions.pubupdate)
+ no_filter[idx_upd] = true;
+ if (pub->pubactions.pubdelete)
+ no_filter[idx_del] = true;
+
+ /* Quick exit loop if all pubactions have no row-filter. */
+ if (no_filter[idx_ins] && no_filter[idx_upd] && no_filter[idx_del])
+ break;
+
+ continue;
+ }
+
+ /*
+ * If the publication is FOR ALL TABLES IN SCHEMA and the current relation
+ * belongs to that schema, then this is also treated the same as if this
+ * table has no filters (even if for some other publication it does).
+ */
+ schemarelids = GetAllSchemaPublicationRelations(pub->oid,
+ pub->pubviaroot ?
+ PUBLICATION_PART_ROOT :
+ PUBLICATION_PART_LEAF);
+ if (list_member_oid(schemarelids, entry->relid))
+ {
+ if (pub->pubactions.pubinsert)
+ no_filter[idx_ins] = true;
+ if (pub->pubactions.pubupdate)
+ no_filter[idx_upd] = true;
+ if (pub->pubactions.pubdelete)
+ no_filter[idx_del] = true;
+
+ list_free(schemarelids);
+
+ /* Quick exit loop if all pubactions have no row-filter. */
+ if (no_filter[idx_ins] && no_filter[idx_upd] && no_filter[idx_del])
+ break;
+
+ continue;
+ }
+ list_free(schemarelids);
/*
* Lookup if there is a row-filter, and if yes remember it in a list (per
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index a2f25f6..73add45 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
-use Test::More tests => 10;
+use Test::More tests => 14;
# create publisher node
my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
@@ -15,6 +15,116 @@ my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
$node_subscriber->init(allows_streaming => 'logical');
$node_subscriber->start;
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+
+# ====================================================================
+# Testcase start: FOR ALL TABLES
+#
+# The FOR ALL TABLES test must come first so that it is not affected by
+# all the other test tables that are later created.
+
+# create tables pub and sub
+$node_publisher->safe_psql('postgres', "CREATE TABLE tab_rf_x (x int primary key)");
+$node_subscriber->safe_psql('postgres', "CREATE TABLE tab_rf_x (x int primary key)");
+
+# insert some initial data
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rf_x (x) VALUES (0), (5), (10), (15), (20)");
+
+# create pub/sub
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_x FOR TABLE tab_rf_x WHERE (x > 10)");
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_forall FOR ALL TABLES");
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_x, tap_pub_forall");
+
+$node_publisher->wait_for_catchup($appname);
+# wait for initial table synchronization to finish
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# The subscription of the FOR ALL TABLES publication means there should be no
+# filtering on the tablesync COPY, so we expect all 5 rows to be present.
+my $result = $node_subscriber->safe_psql('postgres', "SELECT count(x) FROM tab_rf_x");
+is($result, qq(5), 'check initial data copy from table tab_rf_x should not be filtered');
+
+# Similarly, normal filtering after the initial phase will also have no effect.
+# Expected: 5 initial rows + 2 new rows = 7 rows
+$node_publisher->safe_psql('postgres', "INSERT INTO tab_rf_x (x) VALUES (-99), (99)");
+$node_publisher->wait_for_catchup($appname);
+$result = $node_subscriber->safe_psql('postgres', "SELECT count(x) FROM tab_rf_x");
+is($result, qq(7), 'check table tab_rf_x should not be filtered');
+
+# cleanup pub
+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_forall");
+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_x");
+$node_publisher->safe_psql('postgres', "DROP TABLE tab_rf_x");
+# cleanup sub
+$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub");
+$node_subscriber->safe_psql('postgres', "DROP TABLE tab_rf_x");
+
+# Testcase end: FOR ALL TABLES
+# ====================================================================
+
+# ====================================================================
+# Testcase start: ALL TABLES IN SCHEMA
+#
+# The ALL TABLES IN SCHEMA test is independent of all other test cases so it
+# cleans up after itself.
+
+# create tables pub and sub
+$node_publisher->safe_psql('postgres', "CREATE SCHEMA schema_rf_x");
+$node_publisher->safe_psql('postgres', "CREATE TABLE schema_rf_x.tab_rf_x (x int primary key)");
+$node_subscriber->safe_psql('postgres', "CREATE SCHEMA schema_rf_x");
+$node_subscriber->safe_psql('postgres', "CREATE TABLE schema_rf_x.tab_rf_x (x int primary key)");
+
+# insert some initial data
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO schema_rf_x.tab_rf_x (x) VALUES (0), (5), (10), (15), (20)");
+
+# create pub/sub
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_x FOR TABLE schema_rf_x.tab_rf_x WHERE (x > 10)");
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_allinschema FOR ALL TABLES IN SCHEMA schema_rf_x");
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_x, tap_pub_allinschema");
+
+$node_publisher->wait_for_catchup($appname);
+# wait for initial table synchronization to finish
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# The subscription of the ALL TABLES IN SCHEMA publication means there should be
+# no filtering on the tablesync COPY, so we expect all 5 rows to be present.
+$result = $node_subscriber->safe_psql('postgres', "SELECT count(x) FROM schema_rf_x.tab_rf_x");
+is($result, qq(5), 'check initial data copy from table tab_rf_x should not be filtered');
+
+# Similarly, normal filtering after the initial phase will also have no effect.
+# Expected: 5 initial rows + 2 new rows = 7 rows
+$node_publisher->safe_psql('postgres', "INSERT INTO schema_rf_x.tab_rf_x (x) VALUES (-99), (99)");
+$node_publisher->wait_for_catchup($appname);
+$result = $node_subscriber->safe_psql('postgres', "SELECT count(x) FROM schema_rf_x.tab_rf_x");
+is($result, qq(7), 'check table tab_rf_x should not be filtered');
+
+# cleanup pub
+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_allinschema");
+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_x");
+$node_publisher->safe_psql('postgres', "DROP TABLE schema_rf_x.tab_rf_x");
+$node_publisher->safe_psql('postgres', "DROP SCHEMA schema_rf_x");
+# cleanup sub
+$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub");
+$node_subscriber->safe_psql('postgres', "DROP TABLE schema_rf_x.tab_rf_x");
+$node_subscriber->safe_psql('postgres', "DROP SCHEMA schema_rf_x");
+
+# Testcase end: ALL TABLES IN SCHEMA
+# ====================================================================
+
# setup structure on publisher
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
@@ -127,8 +237,6 @@ $node_publisher->safe_psql('postgres',
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
-my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
-my $appname = 'tap_sub';
$node_subscriber->safe_psql('postgres',
"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4a, tap_pub_4b"
);
@@ -136,8 +244,6 @@ $node_subscriber->safe_psql('postgres',
$node_publisher->wait_for_catchup($appname);
# wait for initial table synchronization to finish
-my $synced_query =
- "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
$node_subscriber->poll_query_until('postgres', $synced_query)
or die "Timed out while waiting for subscriber to synchronize data";
@@ -148,7 +254,7 @@ $node_subscriber->poll_query_until('postgres', $synced_query)
# - INSERT (1980, 'not filtered') YES
# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
#
-my $result =
+$result =
$node_subscriber->safe_psql('postgres',
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is( $result, qq(1001|test 1001
--
1.8.3.1
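The precedence rule this patch adds (any FOR ALL TABLES or matching ALL TABLES IN SCHEMA publication disables row filtering for the table) can be sketched as follows. This is an illustrative Python model, not PostgreSQL code; the dict shape and function name are invented for the example:

```python
# Illustrative sketch (not PostgreSQL source) of the precedence rule this
# patch implements: if any subscribed publication was created FOR ALL TABLES,
# or FOR ALL TABLES IN SCHEMA for the table's schema, all row filters from
# other publications are ignored for that table.

def effective_filters(table_schema, pubs):
    """pubs: dicts with keys 'alltables' (bool), 'schemas' (set of schema
    names published via ALL TABLES IN SCHEMA), 'filter' (predicate or None).
    Returns None to mean 'no filtering at all' (publish every row)."""
    for pub in pubs:
        if pub['alltables'] or table_schema in pub['schemas']:
            return None  # precedence rule: ignore every row filter
    filters = [p['filter'] for p in pubs if p['filter'] is not None]
    return filters or None  # no filters anywhere also means publish all

# Mirrors the test scenario: tap_pub_x has WHERE (x > 10) and
# tap_pub_forall is FOR ALL TABLES.
tap_pub_x = {'alltables': False, 'schemas': set(), 'filter': lambda x: x > 10}
tap_pub_forall = {'alltables': True, 'schemas': set(), 'filter': None}

# Subscribing to both means neither the tablesync COPY nor later changes
# are filtered, which is why the test expects all rows to arrive.
unfiltered = effective_filters('public', [tap_pub_x, tap_pub_forall]) is None
```

The same model covers the ALL TABLES IN SCHEMA test case: a publication whose `schemas` set contains the table's schema likewise returns None.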
Attachment: v47-0002-Row-filter-validation.patch (application/octet-stream)
From 08972f2025f55082ab73eec51343c595ea2fe4e0 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 15 Dec 2021 18:35:04 +1100
Subject: [PATCH v47] Row-filter validation
This patch implements parse-tree "walkers" to validate a row-filter.
Expression Node-kind validation
-------------------------------
Only simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no user-defined functions.
- no user-defined types.
- no system functions (unless they are immutable). See design decision at [1].
[1] https://www.postgresql.org/message-id/CAA4eK1%2BXoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ%40mail.gmail.com
Permits only simple nodes including:
List, Const, BoolExpr, NullIfExpr, NullTest, BooleanTest, CoalesceExpr,
CaseExpr, CaseTestExpr, MinMaxExpr, ArrayExpr, ScalarArrayOpExpr, XmlExpr.
Author: Peter Smith, Euler Taveira
REPLICA IDENTITY validation
---------------------------
For publish modes "delete" and "update", validate that any columns referenced
in the filter expression are part of the REPLICA IDENTITY or primary key.
Row filter column validation is done in CheckCmdReplicaIdentity, so that
the validation is executed only when an actual UPDATE or DELETE is executed
on the published relation. This is consistent with the existing replica
identity check and can detect changes related to the row filter in time.
Cache the results of the validation for row filter columns in relcache to
reduce the cost of the validation. It is safe to do this because every
operation that changes the row filter or replica identity will invalidate
the relcache.
Author: Hou zj
---
src/backend/catalog/pg_publication.c | 136 +++++++++++++++-
src/backend/executor/execReplication.c | 36 +++-
src/backend/parser/parse_agg.c | 10 --
src/backend/parser/parse_expr.c | 21 +--
src/backend/parser/parse_func.c | 3 -
src/backend/parser/parse_oper.c | 7 -
src/backend/parser/parse_relation.c | 9 -
src/backend/utils/cache/relcache.c | 262 ++++++++++++++++++++++++++----
src/include/parser/parse_node.h | 1 -
src/include/utils/rel.h | 7 +
src/include/utils/relcache.h | 1 +
src/test/regress/expected/publication.out | 225 ++++++++++++++++++++-----
src/test/regress/sql/publication.sql | 174 +++++++++++++++++---
src/test/subscription/t/027_row_filter.pl | 7 +-
src/tools/pgindent/typedefs.list | 1 +
15 files changed, 742 insertions(+), 158 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 0929aa0..d0d58e4 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,9 +33,11 @@
#include "catalog/pg_publication_namespace.h"
#include "catalog/pg_publication_rel.h"
#include "catalog/pg_type.h"
+#include "catalog/pg_proc.h"
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parse_clause.h"
#include "parser/parse_collate.h"
#include "parser/parse_relation.h"
@@ -112,6 +114,127 @@ check_publication_add_schema(Oid schemaid)
}
/*
+ * Is this a simple Node permitted within a row filter expression?
+ */
+static bool
+IsRowFilterSimpleNode(Node *node)
+{
+ switch (nodeTag(node))
+ {
+ case T_ArrayExpr:
+ case T_BooleanTest:
+ case T_BoolExpr:
+ case T_CaseExpr:
+ case T_CaseTestExpr:
+ case T_CoalesceExpr:
+ case T_Const:
+ case T_List:
+ case T_MinMaxExpr:
+ case T_NullIfExpr:
+ case T_NullTest:
+ case T_ScalarArrayOpExpr:
+ case T_XmlExpr:
+ return true;
+ default:
+ return false;
+ }
+}
+
+/*
+ * The row filter walker checks if the row filter expression is a "simple
+ * expression".
+ *
+ * It allows only simple or compound expressions such as:
+ * - "(Var Op Const)" or
+ * - "(Var Op Var)" or
+ * - "(Var Op Const) Bool (Var Op Const)"
+ * - etc
+ * (where Var is a column of the table this filter belongs to)
+ *
+ * Specifically,
+ * - User-defined operators are not allowed.
+ * - User-defined functions are not allowed.
+ * - User-defined types are not allowed.
+ * - Non-immutable builtin functions are not allowed.
+ *
+ * Notes:
+ *
+ * We don't allow user-defined functions/operators/types because (a) if the
+ * user drops such an object, or if its function raises any other error, the
+ * walsender won't be able to recover even after we fix the function's
+ * problem, because a historic snapshot is used to access the row filter;
+ * (b) a function could access any other table, which won't work because of
+ * the historic snapshots used in the logical decoding environment.
+ *
+ * We don't allow anything other than immutable built-in functions because
+ * non-immutable functions can access the database and would lead to the problem
+ * (b) mentioned in the previous paragraph.
+ */
+static bool
+rowfilter_walker(Node *node, Relation relation)
+{
+ char *errdetail_msg = NULL;
+
+ if (node == NULL)
+ return false;
+
+
+ if (IsRowFilterSimpleNode(node))
+ {
+ /* OK, node is part of simple expressions */
+ }
+ else if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+
+ /* User-defined types not allowed. */
+ if (var->vartype >= FirstNormalObjectId)
+ errdetail_msg = _("User-defined types are not allowed.");
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *) node)->opno >= FirstNormalObjectId)
+ errdetail_msg = _("User-defined operators are not allowed.");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ Oid funcid = ((FuncExpr *) node)->funcid;
+ const char *funcname = get_func_name(funcid);
+
+ /*
+ * User-defined functions are not allowed.
+ * System functions that are not IMMUTABLE are not allowed.
+ */
+ if (funcid >= FirstNormalObjectId)
+ errdetail_msg = psprintf(_("User-defined functions are not allowed (%s)."),
+ funcname);
+ else if (func_volatile(funcid) != PROVOLATILE_IMMUTABLE)
+ errdetail_msg = psprintf(_("Non-immutable built-in functions are not allowed (%s)."),
+ funcname);
+ }
+ else
+ {
+ elog(DEBUG1, "the row filter contained something unexpected: %s", nodeToString(node));
+
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(relation)),
+ errdetail("Expressions only allow columns, constants and some built-in functions and operators.")
+ ));
+ }
+
+ if (errdetail_msg)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(relation)),
+ errdetail("%s", errdetail_msg)
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)relation);
+}
+
+/*
* Returns if relation represented by oid and Form_pg_class entry
* is publishable.
*
@@ -241,10 +364,6 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
-/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
- */
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -298,7 +417,7 @@ GetTransformedWhereClause(ParseState *pstate, PublicationRelInfo *pri,
addNSItemToQuery(pstate, nsitem, false, true, true);
whereclause = transformWhereClause(pstate, copyObject(pri->whereClause),
- EXPR_KIND_PUBLICATION_WHERE,
+ EXPR_KIND_WHERE,
"PUBLICATION WHERE");
/* Fix up collation information */
@@ -362,6 +481,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
* collation information.
*/
whereclause = GetTransformedWhereClause(pstate, pri, true);
+
+ /*
+ * Walk the parse-tree of this publication row filter expression and
+ * throw an error if anything not permitted or unexpected is
+ * encountered.
+ */
+ rowfilter_walker(whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 574d7d2..c175954 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -568,14 +568,46 @@ void
CheckCmdReplicaIdentity(Relation rel, CmdType cmd)
{
PublicationActions *pubactions;
+ AttrNumber bad_rfcolnum;
/* We only need to do checks for UPDATE and DELETE. */
if (cmd != CMD_UPDATE && cmd != CMD_DELETE)
return;
+ if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ return;
+
+ bad_rfcolnum = RelationGetInvalidRowFilterCol(rel);
+
+ /*
+ * It is only safe to execute UPDATE/DELETE when all columns referenced in
+ * the row filters of the publications that the relation is in are valid,
+ * which means all referenced columns are part of the REPLICA IDENTITY, or
+ * the table does not publish UPDATEs or DELETEs.
+ */
+ if (AttributeNumberIsValid(bad_rfcolnum))
+ {
+ const char *colname = get_attname(RelationGetRelid(rel),
+ bad_rfcolnum, false);
+
+ if (cmd == CMD_UPDATE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot update table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Column \"%s\" used in the publication WHERE expression is not part of the replica identity.",
+ colname)));
+ else if (cmd == CMD_DELETE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot delete from table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Column \"%s\" used in the publication WHERE expression is not part of the replica identity.",
+ colname)));
+ }
+
/* If relation has replica identity we are always good. */
- if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
- OidIsValid(RelationGetReplicaIndex(rel)))
+ if (OidIsValid(RelationGetReplicaIndex(rel)))
return;
/*
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..7d829a0 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,13 +551,6 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
- case EXPR_KIND_PUBLICATION_WHERE:
- if (isAgg)
- err = _("aggregate functions are not allowed in publication WHERE expressions");
- else
- err = _("grouping operations are not allowed in publication WHERE expressions");
-
- break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -950,9 +943,6 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
- case EXPR_KIND_PUBLICATION_WHERE:
- err = _("window functions are not allowed in publication WHERE expressions");
- break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..2d1a477 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,19 +200,8 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- {
- /*
- * Forbid functions in publication WHERE condition
- */
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("functions are not allowed in publication WHERE expressions"),
- parser_errposition(pstate, exprLocation(expr))));
-
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
- }
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -515,7 +504,6 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
- case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1776,9 +1764,6 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
- case EXPR_KIND_PUBLICATION_WHERE:
- err = _("cannot use subquery in publication WHERE expression");
- break;
/*
* There is intentionally no default: case here, so that the
@@ -3099,8 +3084,6 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
- case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 29bebb7..542f916 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,9 +2655,6 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
- case EXPR_KIND_PUBLICATION_WHERE:
- err = _("set-returning functions are not allowed in publication WHERE expressions");
- break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..bc34a23 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,13 +718,6 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
- /* Check it's not a custom operator for publication WHERE expressions */
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("user-defined operators are not allowed in publication WHERE expressions"),
- parser_errposition(pstate, location)));
-
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 036d9c6..c5c3f26 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -3538,20 +3538,11 @@ errorMissingRTE(ParseState *pstate, RangeVar *relation)
rte->eref->aliasname)),
parser_errposition(pstate, relation->location)));
else
- {
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
- ereport(ERROR,
- (errcode(ERRCODE_UNDEFINED_TABLE),
- errmsg("publication WHERE expression invalid reference to table \"%s\"",
- relation->relname),
- parser_errposition(pstate, relation->location)));
-
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_TABLE),
errmsg("missing FROM-clause entry for table \"%s\"",
relation->relname),
parser_errposition(pstate, relation->location)));
- }
}
/*
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 105d8d4..ed04881 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -56,6 +56,7 @@
#include "catalog/pg_opclass.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "catalog/pg_rewrite.h"
#include "catalog/pg_shseclabel.h"
#include "catalog/pg_statistic_ext.h"
@@ -71,6 +72,8 @@
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/optimizer.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_relation.h"
#include "rewrite/rewriteDefine.h"
#include "rewrite/rowsecurity.h"
#include "storage/lmgr.h"
@@ -84,6 +87,7 @@
#include "utils/memutils.h"
#include "utils/relmapper.h"
#include "utils/resowner_private.h"
+#include "utils/ruleutils.h"
#include "utils/snapmgr.h"
#include "utils/syscache.h"
@@ -5521,57 +5525,169 @@ RelationGetExclusionInfo(Relation indexRelation,
MemoryContextSwitchTo(oldcxt);
}
+/* For invalid_rowfilter_column_walker. */
+typedef struct {
+ AttrNumber invalid_rfcolnum; /* invalid column number */
+ Bitmapset *bms_replident; /* bitset of replica identity col indexes */
+ bool pubviaroot; /* true if we are validating the parent
+ * relation's row filter */
+ Oid relid; /* relid of the relation */
+ Oid parentid; /* relid of the parent relation */
+} rf_context;
+
/*
- * Get publication actions for the given relation.
+ * Check if any columns used in the row-filter WHERE clause are not part of
+ * REPLICA IDENTITY and save the invalid column number in
+ * rf_context::invalid_rfcolnum.
*/
-struct PublicationActions *
-GetRelationPublicationActions(Relation relation)
+static bool
+invalid_rowfilter_column_walker(Node *node, rf_context *context)
{
- List *puboids;
- ListCell *lc;
- MemoryContext oldcxt;
- Oid schemaid;
- PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+ AttrNumber attnum = var->varattno;
+
+ /*
+ * If pubviaroot is true, we need to convert the column number of
+ * the parent to the column number of the child relation first.
+ */
+ if (context->pubviaroot)
+ {
+ char *colname = get_attname(context->parentid, attnum, false);
+ attnum = get_attnum(context->relid, colname);
+ }
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber,
+ context->bms_replident))
+ {
+ context->invalid_rfcolnum = attnum;
+ return true;
+ }
+ }
+
+ return expression_tree_walker(node, invalid_rowfilter_column_walker,
+ (void *) context);
+}
+
+/*
+ * Append to cur_puboids each member of add_puboids that isn't already in
+ * cur_puboids.
+ *
+ * Also update the topmost parent relation's relid in the publication.
+ */
+static void
+concat_publication_oid(Oid relid,
+ List **cur_puboids,
+ List **toprelid_in_pub,
+ const List *add_puboids)
+{
+ ListCell *lc1,
+ *lc2,
+ *lc3;
+
+ foreach(lc1, add_puboids)
+ {
+ bool is_member = false;
+
+ forboth(lc2, *cur_puboids, lc3, *toprelid_in_pub)
+ {
+ if (lfirst_oid(lc2) == lfirst_oid(lc1))
+ {
+ is_member = true;
+ lfirst_oid(lc3) = relid;
+ }
+ }
+
+ if (!is_member)
+ {
+ *cur_puboids = lappend_oid(*cur_puboids, lfirst_oid(lc1));
+ *toprelid_in_pub = lappend_oid(*toprelid_in_pub, relid);
+ }
+ }
+}
+
+/*
+ * Get the invalid row filter column number for the given relation.
+ *
+ * Traverse all the publications which the relation is in to get the
+ * publication actions. If the publication actions include UPDATE or DELETE,
+ * then validate that all columns referenced in the row filter expression
+ * are part of REPLICA IDENTITY.
+ *
+ * If not all the row filter columns are part of REPLICA IDENTITY, return the
+ * invalid column number, otherwise InvalidAttrNumber.
+ */
+AttrNumber
+RelationGetInvalidRowFilterCol(Relation relation)
+{
+ List *puboids,
+ *toprelid_in_pub;
+ ListCell *lc;
+ MemoryContext oldcxt;
+ Oid schemaid;
+ Oid relid = RelationGetRelid(relation);
+ rf_context context = { 0 };
+ PublicationActions pubactions = { 0 };
+ bool rfcol_valid = true;
+ AttrNumber invalid_rfcolnum = InvalidAttrNumber;
/*
* If not publishable, it publishes no actions. (pgoutput_change() will
* ignore it.)
*/
- if (!is_publishable_relation(relation))
- return pubactions;
-
- if (relation->rd_pubactions)
- return memcpy(pubactions, relation->rd_pubactions,
- sizeof(PublicationActions));
+ if (!is_publishable_relation(relation) || relation->rd_rfcol_valid)
+ return invalid_rfcolnum;
/* Fetch the publication membership info. */
- puboids = GetRelationPublications(RelationGetRelid(relation));
+ toprelid_in_pub = puboids = NIL;
+ concat_publication_oid(relid, &puboids, &toprelid_in_pub,
+ GetRelationPublications(relid));
schemaid = RelationGetNamespace(relation);
- puboids = list_concat_unique_oid(puboids, GetSchemaPublications(schemaid));
+ concat_publication_oid(relid, &puboids, &toprelid_in_pub,
+ GetSchemaPublications(schemaid));
if (relation->rd_rel->relispartition)
{
/* Add publications that the ancestors are in too. */
- List *ancestors = get_partition_ancestors(RelationGetRelid(relation));
+ List *ancestors = get_partition_ancestors(relid);
ListCell *lc;
foreach(lc, ancestors)
{
Oid ancestor = lfirst_oid(lc);
- puboids = list_concat_unique_oid(puboids,
- GetRelationPublications(ancestor));
+ concat_publication_oid(ancestor, &puboids, &toprelid_in_pub,
+ GetRelationPublications(ancestor));
schemaid = get_rel_namespace(ancestor);
- puboids = list_concat_unique_oid(puboids,
- GetSchemaPublications(schemaid));
+ concat_publication_oid(ancestor, &puboids, &toprelid_in_pub,
+ GetSchemaPublications(schemaid));
}
+
+ relid = llast_oid(ancestors);
}
- puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
+ concat_publication_oid(relid, &puboids, &toprelid_in_pub,
+ GetAllTablesPublications());
+
+ /*
+ * Find the columns that are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (relation->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT)
+ context.bms_replident = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else if (relation->rd_rel->relreplident == REPLICA_IDENTITY_INDEX)
+ context.bms_replident = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_IDENTITY_KEY);
foreach(lc, puboids)
{
Oid pubid = lfirst_oid(lc);
HeapTuple tup;
+
Form_pg_publication pubform;
tup = SearchSysCache1(PUBLICATIONOID, ObjectIdGetDatum(pubid));
@@ -5581,35 +5697,116 @@ GetRelationPublicationActions(Relation relation)
pubform = (Form_pg_publication) GETSTRUCT(tup);
- pubactions->pubinsert |= pubform->pubinsert;
- pubactions->pubupdate |= pubform->pubupdate;
- pubactions->pubdelete |= pubform->pubdelete;
- pubactions->pubtruncate |= pubform->pubtruncate;
+ pubactions.pubinsert |= pubform->pubinsert;
+ pubactions.pubupdate |= pubform->pubupdate;
+ pubactions.pubdelete |= pubform->pubdelete;
+ pubactions.pubtruncate |= pubform->pubtruncate;
ReleaseSysCache(tup);
/*
- * If we know everything is replicated, there is no point to check for
- * other publications.
+ * If the publication actions include UPDATE or DELETE, validate
+ * that all columns referenced in the filter expression are part of
+ * the REPLICA IDENTITY index.
+ *
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter and we can skip the validation.
+ *
+ * If we have already found a row filter column that is not part
+ * of the REPLICA IDENTITY index, skip the validation too.
*/
- if (pubactions->pubinsert && pubactions->pubupdate &&
- pubactions->pubdelete && pubactions->pubtruncate)
+ if ((pubform->pubupdate || pubform->pubdelete) &&
+ relation->rd_rel->relreplident != REPLICA_IDENTITY_FULL &&
+ rfcol_valid)
+ {
+ HeapTuple rftuple;
+
+ if (pubform->pubviaroot)
+ relid = list_nth_oid(toprelid_in_pub,
+ foreach_current_index(lc));
+ else
+ relid = RelationGetRelid(relation);
+
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP,
+ ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pubid));
+
+ if (HeapTupleIsValid(rftuple))
+ {
+ Datum rfdatum;
+ bool rfisnull;
+ Node *rfnode;
+
+ context.pubviaroot = pubform->pubviaroot;
+ context.parentid = relid;
+ context.relid = RelationGetRelid(relation);
+
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
+ Anum_pg_publication_rel_prqual,
+ &rfisnull);
+
+ if (!rfisnull)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfcol_valid = !invalid_rowfilter_column_walker(rfnode,
+ &context);
+ invalid_rfcolnum = context.invalid_rfcolnum;
+ pfree(rfnode);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+ }
+
+ /*
+ * If we know everything is replicated and a row filter column is not
+ * part of the replica identity, there is no point in checking other
+ */
+ if (pubactions.pubinsert && pubactions.pubupdate &&
+ pubactions.pubdelete && pubactions.pubtruncate &&
+ !rfcol_valid)
break;
}
+ bms_free(context.bms_replident);
+
if (relation->rd_pubactions)
{
pfree(relation->rd_pubactions);
relation->rd_pubactions = NULL;
}
+ relation->rd_rfcol_valid = rfcol_valid;
+
/* Now save copy of the actions in the relcache entry. */
oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
relation->rd_pubactions = palloc(sizeof(PublicationActions));
- memcpy(relation->rd_pubactions, pubactions, sizeof(PublicationActions));
+ memcpy(relation->rd_pubactions, &pubactions, sizeof(PublicationActions));
MemoryContextSwitchTo(oldcxt);
- return pubactions;
+ return invalid_rfcolnum;
+}
+
+/*
+ * Get publication actions for the given relation.
+ */
+struct PublicationActions *
+GetRelationPublicationActions(Relation relation)
+{
+ PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
+
+ /*
+ * If not publishable, it publishes no actions. (pgoutput_change() will
+ * ignore it.)
+ */
+ if (!is_publishable_relation(relation))
+ return pubactions;
+
+ if (!relation->rd_pubactions)
+ (void) RelationGetInvalidRowFilterCol(relation);
+
+ return memcpy(pubactions, relation->rd_pubactions,
+ sizeof(PublicationActions));
}
/*
@@ -6163,6 +6360,7 @@ load_relcache_init_file(bool shared)
rel->rd_idattr = NULL;
rel->rd_hotblockingattr = NULL;
rel->rd_pubactions = NULL;
+ rel->rd_rfcol_valid = false;
rel->rd_statvalid = false;
rel->rd_statlist = NIL;
rel->rd_fkeyvalid = false;
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index d58ae6a..ee17908 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,7 +80,6 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
- EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 3128127..27cec81 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -164,6 +164,13 @@ typedef struct RelationData
PublicationActions *rd_pubactions; /* publication actions */
/*
+ * true if the columns referenced in row filters from all the publications
+ * the relation is in are part of replica identity, or the publication
+ * actions do not include UPDATE or DELETE.
+ */
+ bool rd_rfcol_valid;
+
+ /*
* rd_options is set whenever rd_rel is loaded into the relcache entry.
* Note that you can NOT look into rd_rel for this data. NULL means "use
* defaults".
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 82316bb..25c759f 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -76,6 +76,7 @@ extern void RelationInitIndexAccessInfo(Relation relation);
/* caller must include pg_publication.h */
struct PublicationActions;
extern struct PublicationActions *GetRelationPublicationActions(Relation relation);
+extern AttrNumber RelationGetInvalidRowFilterCol(Relation relation);
extern void RelationInitTableAccessMethod(Relation relation);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 6bf0bd7..80c0c6d 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -243,18 +243,21 @@ CREATE TABLE testpub_rf_tbl1 (a integer, b text);
CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
-CREATE SCHEMA testpub_rf_myschema;
-CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
-CREATE SCHEMA testpub_rf_myschema1;
-CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+CREATE TABLE testpub_rf_tbl5 (a xml);
+CREATE SCHEMA testpub_rf_schema1;
+CREATE TABLE testpub_rf_schema1.testpub_rf_tbl5 (h integer);
+CREATE SCHEMA testpub_rf_schema2;
+CREATE TABLE testpub_rf_schema2.testpub_rf_tb16 (i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- First, test using the option publish = 'insert' because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
@@ -264,7 +267,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
@@ -275,7 +278,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE ((e > 1000) AND (e < 2000))
@@ -286,7 +289,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
@@ -308,43 +311,43 @@ Publications:
DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax1
Publication testpub_syntax1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (e < 999)
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_schema1.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax2
Publication testpub_syntax2
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
- "testpub_rf_myschema.testpub_rf_tbl5" WHERE (h < 999)
+ "testpub_rf_schema1.testpub_rf_tbl5" WHERE (h < 999)
DROP PUBLICATION testpub_syntax2;
--- fail - schemas are not allowed WHERE row-filter
+-- fail - schemas don't allow WHERE clause
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1 WHERE (a = 123);
ERROR: syntax error at or near "WHERE"
-LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a =...
+LINE 1: ...ntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1 WHERE (a =...
^
-CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1, testpub_rf_schema1 WHERE (a = 123);
ERROR: WHERE clause for schema not allowed
-LINE 1: ...ax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf...
+LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1, testpub_rf...
^
RESET client_min_messages;
--- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+-- fail - duplicate tables are not allowed if that table has any WHERE clause
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
@@ -353,43 +356,181 @@ ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
RESET client_min_messages;
-- fail - aggregate functions not allowed in WHERE clause
ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e < AVG(e));
-ERROR: functions are not allowed in publication WHERE expressions
+ERROR: aggregate functions are not allowed in WHERE
LINE 1: ...TION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e < AVG(e));
^
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
--- fail - user-defined operators disallowed
-CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
-CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+-- fail - user-defined operators are not allowed
+CREATE FUNCTION testpub_rf_func1(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func1, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: User-defined operators are not allowed.
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func2() RETURNS integer AS $$ BEGIN RETURN 123; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a >= testpub_rf_func2());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: User-defined functions are not allowed (testpub_rf_func2).
+-- fail - non-immutable functions are not allowed. random() is volatile.
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a < random());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: Non-immutable built-in functions are not allowed (random).
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
+-- ok - builtin operators are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE ((a > 5) IS FALSE);
+-- ok - immutable builtin functions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined types disallowed
+CREATE TYPE rf_bug_status AS ENUM ('new', 'open', 'closed');
+CREATE TABLE rf_bug (id serial, description text, status rf_bug_status);
+CREATE PUBLICATION testpub6 FOR TABLE rf_bug WHERE (status = 'open') WITH (publish = 'insert');
+ERROR: invalid publication WHERE expression for relation "rf_bug"
+DETAIL: User-defined types are not allowed
+DROP TABLE rf_bug;
+DROP TYPE rf_bug_status;
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: Expressions only allow columns, constants and some built-in functions and operators.
+-- ok - conditional expressions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (a IS DOCUMENT);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (xmlexists('//foo[text() = ''bar'']' PASSING BY VALUE a));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1, 2) = a);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (CASE a WHEN 5 THEN true ELSE false END);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (COALESCE(b, 'foo') = 'foo');
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (GREATEST(a, 10) > 10);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IN (2, 4, 6));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (ARRAY[a] <@ ARRAY[2, 4, 6]);
-- fail - WHERE not allowed in DROP
-ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl1 WHERE (e < 27);
ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
-ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
-ERROR: cannot add relation "testpub_rf_myschema1.testpub_rf_tb16" to publication
-DETAIL: Table's schema "testpub_rf_myschema1" is already part of the publication or part of the specified schema list.
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_schema2;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_schema2, TABLE testpub_rf_schema2.testpub_rf_tb16;
+ERROR: cannot add relation "testpub_rf_schema2.testpub_rf_tb16" to publication
+DETAIL: Table's schema "testpub_rf_schema2" is already part of the publication or part of the specified schema list.
RESET client_min_messages;
DROP TABLE testpub_rf_tbl1;
DROP TABLE testpub_rf_tbl2;
DROP TABLE testpub_rf_tbl3;
DROP TABLE testpub_rf_tbl4;
-DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
-DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
-DROP SCHEMA testpub_rf_myschema;
-DROP SCHEMA testpub_rf_myschema1;
+DROP TABLE testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema1.testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema2.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_schema1;
+DROP SCHEMA testpub_rf_schema2;
DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
-DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func1(integer, integer);
+DROP FUNCTION testpub_rf_func2();
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+CREATE TABLE rf_tbl_abcd_part_pk (a int PRIMARY KEY, b int) PARTITION by RANGE (a);
+CREATE TABLE rf_tbl_abcd_part_pk_1 (b int, a int PRIMARY KEY);
+ALTER TABLE rf_tbl_abcd_part_pk ATTACH PARTITION rf_tbl_abcd_part_pk_1 FOR VALUES FROM (1) TO (10);
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+-- ok - "a" is a PK col
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (b > 99);
+-- ok - "b" is a PK col
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "c" is not part of the PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Column "c" used in the publication WHERE expression is not part of the replica identity.
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- fail - "d" is not part of the PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Column "d" used in the publication WHERE expression is not part of the replica identity.
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not part of REPLICA IDENTITY
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
+DETAIL: Column "a" used in the publication WHERE expression is not part of the replica identity.
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "a" is in REPLICA IDENTITY now
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Column "a" used in the publication WHERE expression is not part of the replica identity.
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Column "c" used in the publication WHERE expression is not part of the replica identity.
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
+DETAIL: Column "a" used in the publication WHERE expression is not part of the replica identity.
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Column "a" used in the publication WHERE expression is not part of the replica identity.
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
+DETAIL: Column "a" used in the publication WHERE expression is not part of the replica identity.
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+-- Tests for partitioned table
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (a > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- ok - "a" is a OK col
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (b > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- fail - "b" is not in REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_part_pk_1"
+DETAIL: Column "b" used in the publication WHERE expression is not part of the replica identity.
+DROP PUBLICATION testpub6;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+DROP TABLE rf_tbl_abcd_part_pk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 4b5ce05..464b3ae 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -138,12 +138,15 @@ CREATE TABLE testpub_rf_tbl1 (a integer, b text);
CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
-CREATE SCHEMA testpub_rf_myschema;
-CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
-CREATE SCHEMA testpub_rf_myschema1;
-CREATE TABLE testpub_rf_myschema1.testpub_rf_tb16(i integer);
+CREATE TABLE testpub_rf_tbl5 (a xml);
+CREATE SCHEMA testpub_rf_schema1;
+CREATE TABLE testpub_rf_schema1.testpub_rf_tbl5 (h integer);
+CREATE SCHEMA testpub_rf_schema2;
+CREATE TABLE testpub_rf_schema2.testpub_rf_tb16 (i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish='insert' because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -162,53 +165,176 @@ RESET client_min_messages;
DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax1
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_schema1.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax2
DROP PUBLICATION testpub_syntax2;
--- fail - schemas are not allowed WHERE row-filter
+-- fail - schemas don't allow WHERE clause
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
-CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1 WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1, testpub_rf_schema1 WHERE (a = 123);
RESET client_min_messages;
--- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+-- fail - duplicate tables are not allowed if that table has any WHERE clause
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
RESET client_min_messages;
-- fail - aggregate functions not allowed in WHERE clause
ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e < AVG(e));
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
--- fail - user-defined operators disallowed
-CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
-CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+-- fail - user-defined operators are not allowed
+CREATE FUNCTION testpub_rf_func1(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func1, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func2() RETURNS integer AS $$ BEGIN RETURN 123; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a >= testpub_rf_func2());
+-- fail - non-immutable functions are not allowed. random() is volatile.
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a < random());
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
+-- ok - builtin operators are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE ((a > 5) IS FALSE);
+-- ok - immutable builtin functions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined types disallowed
+CREATE TYPE rf_bug_status AS ENUM ('new', 'open', 'closed');
+CREATE TABLE rf_bug (id serial, description text, status rf_bug_status);
+CREATE PUBLICATION testpub6 FOR TABLE rf_bug WHERE (status = 'open') WITH (publish = 'insert');
+DROP TABLE rf_bug;
+DROP TYPE rf_bug_status;
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub7 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+-- ok - conditional expressions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (a IS DOCUMENT);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (xmlexists('//foo[text() = ''bar'']' PASSING BY VALUE a));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1, 2) = a);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (CASE a WHEN 5 THEN true ELSE false END);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (COALESCE(b, 'foo') = 'foo');
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (GREATEST(a, 10) > 10);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IN (2, 4, 6));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (ARRAY[a] <@ ARRAY[2, 4, 6]);
-- fail - WHERE not allowed in DROP
-ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl1 WHERE (e < 27);
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
-ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_schema2;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_schema2, TABLE testpub_rf_schema2.testpub_rf_tb16;
RESET client_min_messages;
DROP TABLE testpub_rf_tbl1;
DROP TABLE testpub_rf_tbl2;
DROP TABLE testpub_rf_tbl3;
DROP TABLE testpub_rf_tbl4;
-DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
-DROP TABLE testpub_rf_myschema1.testpub_rf_tb16;
-DROP SCHEMA testpub_rf_myschema;
-DROP SCHEMA testpub_rf_myschema1;
+DROP TABLE testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema1.testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema2.testpub_rf_tb16;
+DROP SCHEMA testpub_rf_schema1;
+DROP SCHEMA testpub_rf_schema2;
DROP PUBLICATION testpub5;
DROP PUBLICATION testpub7;
DROP OPERATOR =#>(integer, integer);
-DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func1(integer, integer);
+DROP FUNCTION testpub_rf_func2();
+
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+CREATE TABLE rf_tbl_abcd_part_pk (a int PRIMARY KEY, b int) PARTITION by RANGE (a);
+CREATE TABLE rf_tbl_abcd_part_pk_1 (b int, a int PRIMARY KEY);
+ALTER TABLE rf_tbl_abcd_part_pk ATTACH PARTITION rf_tbl_abcd_part_pk_1 FOR VALUES FROM (1) TO (10);
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+-- ok - "a" is a PK col
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (b > 99);
+-- ok - "b" is a PK col
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "c" is not part of the PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- fail - "d" is not part of the PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not part of REPLICA IDENTITY
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "a" is in REPLICA IDENTITY now
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+
+-- Tests for partitioned table
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (a > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- ok - "a" is a OK col
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (b > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- fail - "b" is not in REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+
+DROP PUBLICATION testpub6;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+DROP TABLE rf_tbl_abcd_part_pk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index 64e71d0..de6b73d 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -280,9 +282,7 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -291,7 +291,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index f41ef0d..575969c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3501,6 +3501,7 @@ replace_rte_variables_context
ret_type
rewind_source
rewrite_event
+rf_context
rijndael_ctx
rm_detail_t
role_auth_extra
--
1.8.3.1
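For readers following the thread, the column-reference rule that the regression tests above exercise can be sketched in Python. This is an illustrative sketch only (the helper name is invented; the patch implements this in C around its invalid_rfcolnum logic): for UPDATE and DELETE, every column referenced in a publication WHERE expression must be part of the table's replica identity, because the old tuple carries only replica identity columns.

```python
# Sketch of the row-filter column validation tested above: a publication
# WHERE expression is valid for update/delete only if all columns it
# references are covered by the table's replica identity.

def invalid_filter_column(filter_cols, replident_cols):
    """Return the first filter column not covered by the replica identity,
    or None if the filter is valid for update/delete."""
    for col in sorted(filter_cols):
        if col not in replident_cols:
            return col
    return None

# Case 1a: REPLICA IDENTITY DEFAULT with PK (a, b) -> WHERE (c > 99) fails
print(invalid_filter_column({"c"}, {"a", "b"}))            # c
# Case 2: REPLICA IDENTITY FULL -> every column is in the identity, so ok
print(invalid_filter_column({"c"}, {"a", "b", "c", "d"}))  # None
# Case 3: REPLICA IDENTITY NOTHING -> no column can be referenced
print(invalid_filter_column({"a"}, set()))                 # a
```

This mirrors why REPLICA IDENTITY FULL makes every WHERE column acceptable, while REPLICA IDENTITY NOTHING rejects any column reference at all.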
On Fri, Dec 17, 2021 at 7:11 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Kindly do not change the mode of src/backend/parser/gram.y.
Oops. Sorry that was not deliberate.
I will correct that in the next version.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Wed, Dec 15, 2021 at 3:50 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Dec 13, 2021 at 8:49 PM Peter Smith <smithpb2250@gmail.com> wrote:
PSA the v46* patch set.
0001
...
(2) In the 0001 patch comment, the term "publication filter" is used
in one place, and in others "row filter" or "row-filter".
Fixed in v47 [1]
src/backend/catalog/pg_publication.c
(3) GetTransformedWhereClause() is missing a function comment.
Fixed in v47 [1]
(4)
The following comment seems incomplete:
+ /* Fix up collation information */
+ whereclause = GetTransformedWhereClause(pstate, pri, true);
Fixed in v47 [1]
src/backend/parser/parse_relation.c
(5)
wording? consistent?
Shouldn't it be "publication WHERE expression" for consistency?
In v47 [1] this message is removed when the KIND is removed.
+ errmsg("publication row-filter WHERE invalid reference to table \"%s\"",
+ relation->relname),

src/backend/replication/logical/tablesync.c
(6)(i) Improve wording:
BEFORE:
/*
* Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY command.
*/
AFTER:
/*
- * Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * Get information about a remote relation, in a similar fashion to how the RELATION
+ * message provides information during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
*/
Fixed in v47 [1]
(ii) fetch_remote_table_info() doesn't currently account for ALL
TABLES and ALL TABLES IN SCHEMA.
Fixed in v47 [1]
...
0002
src/backend/catalog/pg_publication.c
(1) rowfilter_walker()
One of the errdetail messages doesn't begin with an uppercase letter:
+ errdetail_msg = _("user-defined types are not allowed");
Fixed in v47 [1]
src/backend/executor/execReplication.c
(2) CheckCmdReplicaIdentity()
Strictly speaking, the following:
+ if (invalid_rfcolnum)
should be:
+ if (invalid_rfcolnum != InvalidAttrNumber)
Fixed in v47 [1]
0003
src/backend/replication/logical/tablesync.c
(1)
Column name in comment should be "puballtables" not "puballtable":
+ * If any publication has puballtable true then all row-filtering is
Fixed in v47 [1]
(2) pgoutput_row_filter_init()
There should be a space before the final "*/" (so the asterisks align).
Also, should say "... treated the same".
/*
+ * If the publication is FOR ALL TABLES then it is treated same as if this
+ * table has no filters (even if for some other publication it does).
+ */
Fixed in v47 [1]
------
[1]: /messages/by-id/CAHut+Ptjsj_OVMWEdYp2Wq19=H5D4Vgta43FbFVDYr2LuS_djg@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Dec 14, 2021 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Dec 14, 2021 at 4:44 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Tue, Dec 7, 2021 at 5:48 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
I think for "FOR ALL TABLE" publication(p2 in my case), table tbl should be
treated as no filter, and table tbl should have no filter in subscription sub. Thoughts?

But for now, the filter(a > 10) works both when copying initial data and later changes.
To fix it, I think we can check if the table is published in a 'FOR ALL TABLES'
publication or published as part of schema in function pgoutput_row_filter_init
(which was introduced in v44-0003 patch), also we need to make some changes in
tablesync.c.

Partly fixed in v46-0005 [1]
NOTE
- The initial COPY part of the tablesync does not take the publish
operation into account so it means that if any of the subscribed
publications have "puballtables" flag then all data will be copied
sans filters.

I think this should be okay but the way you have implemented it in the
patch doesn't appear to be the optimal way. Can't we fetch
allpubtables info and qual info as part of one query instead of using
separate queries?
Fixed in v47 [1]. Now code uses a unified SQL query provided by Vignesh.
I guess this is consistent with the other decision to
ignore publication operations [2].

TODO
- Documentation
- IIUC there is a similar case yet to be addressed - FOR ALL TABLES IN SCHEMA

Yeah, "FOR ALL TABLES IN SCHEMA" should also be addressed. In this
case, the difference would be that we need to check the presence of
schema corresponding to the table (for which we are fetching
row_filter information) is there in pg_publication_namespace. If it
exists then we don't need to apply row_filter for the table. I feel it
is better to fetch all this information as part of the query which you
are using to fetch row_filter info. The idea is to avoid the extra
round-trip between subscriber and publisher.
Fixed in v47 [1]. Added code and TAP test case for ALL TABLES IN SCHEMA.
Few other comments:
===================
1.
@@ -926,6 +928,22 @@ pgoutput_row_filter_init(PGOutputData *data,
Relation relation, RelationSyncEntr
bool rfisnull;

/*
+ * If the publication is FOR ALL TABLES then it is treated same as if this
+ * table has no filters (even if for some other publication it does).
+ */
+ if (pub->alltables)
+ {
+ if (pub->pubactions.pubinsert)
+ no_filter[idx_ins] = true;
+ if (pub->pubactions.pubupdate)
+ no_filter[idx_upd] = true;
+ if (pub->pubactions.pubdelete)
+ no_filter[idx_del] = true;
+
+ continue;
+ }

Is there a reason to continue checking the other publications if
no_filter is true for all kind of pubactions?
Fixed in v47 [1].
2.
+ * All row filter expressions will be discarded if there is one
+ * publication-relation entry without a row filter. That's because
+ * all expressions are aggregated by the OR operator. The row
+ * filter absence means replicate all rows so a single valid
+ * expression means publish this row.

This same comment is at two places, remove from one of the places. I
think keeping it atop for loop is better.
Fixed in v47 [1]
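For readers following the thread, the OR-aggregation behaviour described in the comment above can be sketched as follows. This is an illustrative Python sketch only (the patch implements this in pgoutput's C code, and the helper name is invented here):

```python
# Sketch: per-publication row filters for one table are combined with OR.
# A publication without a filter publishes every row, so if any publication
# has no filter, all the other expressions can be discarded.

def combine_filters(filters):
    """filters: one entry per publication; a predicate, or None when that
    publication has no row filter. Returns None (publish all rows) or a
    combined predicate."""
    if any(f is None for f in filters):
        return None  # one unfiltered publication: replicate every row
    return lambda row: any(f(row) for f in filters)

# One publication filters on a > 99, another has no filter at all:
combined = combine_filters([lambda r: r["a"] > 99, None])
print(combined)  # None, i.e. no filtering is applied
```

The print shows that the presence of a single unfiltered publication makes the combined result "no filter at all", which is why the patch can stop accumulating expressions as soon as it sees one.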
3.
+ {
+ int idx;
+ bool found_filters = false;

I am not sure if starting such ad-hoc braces in the code to localize
the scope of variables is a regular practice. Can we please remove
this?
Fixed in v47 [1]
------
[1]: /messages/by-id/CAHut+Ptjsj_OVMWEdYp2Wq19=H5D4Vgta43FbFVDYr2LuS_djg@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Dec 14, 2021 at 10:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Dec 14, 2021 at 10:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Dec 14, 2021 at 4:44 AM Peter Smith <smithpb2250@gmail.com> wrote:
Few other comments:
===================

Few more comments:
==================
v46-0001/0002
===============
1. After rowfilter_walker() why do we need
EXPR_KIND_PUBLICATION_WHERE? I thought this is primarily to identify
the expressions that are not allowed in rowfilter which we are now
able to detect upfront with the help of a walker. Can't we instead use
EXPR_KIND_WHERE?
Fixed in v47 [1]
2.
+Node *
+GetTransformedWhereClause(ParseState *pstate, PublicationRelInfo *pri,
+ bool bfixupcollation)

Can we add comments atop this function?
Fixed in v47 [1]
3. In GetTransformedWhereClause, can we change the name of variables
(a) bfixupcollation to fixup_collation or assign_collation, (b)
transformedwhereclause to whereclause. I think that will make the
function more readable.
Fixed in v47 [1]
v46-0002
========
4.
+ else if (IsA(node, List) || IsA(node, Const) || IsA(node, BoolExpr) || IsA(node, NullIfExpr) ||
+ IsA(node, NullTest) || IsA(node, BooleanTest) || IsA(node, CoalesceExpr) ||
+ IsA(node, CaseExpr) || IsA(node, CaseTestExpr) || IsA(node, MinMaxExpr) ||
+ IsA(node, ArrayExpr) || IsA(node, ScalarArrayOpExpr) || IsA(node, XmlExpr))

Can we move this to a separate function say IsValidRowFilterExpr() or
something on those lines and use Switch (nodetag(node)) to identify
these nodes?
Fixed in v47 [1]
------
[1]: /messages/by-id/CAHut+Ptjsj_OVMWEdYp2Wq19=H5D4Vgta43FbFVDYr2LuS_djg@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Fri, Dec 17, 2021 at 9:41 AM Peter Smith <smithpb2250@gmail.com> wrote:
PSA the v47* patch set.
I found that even though there are now separately-maintained WHERE clauses
per pubaction, there still seem to be problems when applying the old/new
row rules for UPDATE.
A simple example of this was previously discussed in [1].
The example is repeated below:
---- Publication
create table tbl1 (a int primary key, b int);
create publication A for table tbl1 where (b<2) with(publish='insert');
create publication B for table tbl1 where (a>1) with(publish='update');
---- Subscription
create table tbl1 (a int primary key, b int);
create subscription sub connection 'dbname=postgres host=localhost
port=10000' publication A,B;
---- Publication
insert into tbl1 values (1,1);
update tbl1 set a = 2;
So using the v47 patch-set, I still find that the UPDATE above results in
publication of an INSERT of (2,1), rather than an UPDATE of (1,1) to (2,1).
This is according to the 2nd UPDATE rule below, from patch 0003.
+ * old-row (no match) new-row (no match) -> (drop change)
+ * old-row (no match) new row (match) -> INSERT
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (match) new row (match) -> UPDATE
This is because the old row (1,1) doesn't match the UPDATE filter "(a>1)",
but the new row (2,1) does.
This functionality doesn't seem right to me. I don't think it can be
assumed that (1,1) was never published (and thus requires an INSERT rather
than UPDATE) based on these checks, because in this example, (1,1) was
previously published via a different operation - INSERT (and using a
different filter too).
I think the fundamental problem here is that these UPDATE rules assume that
the old (current) row was previously UPDATEd (and published, or not
published, according to the filter applicable to UPDATE), but this is not
necessarily the case.
Or am I missing something?
----
[1]: /messages/by-id/CAJcOf-dz0srExG0NPPgXh5X8eL2uxk7C=cZoGTbf8cNqoRUY6w@mail.gmail.com
Regards,
Greg Nancarrow
Fujitsu Australia
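The four transformation rules Greg quotes reduce to a small decision table over the old-row and new-row filter results. The sketch below (Python, with an invented helper name; the patch itself does this on PostgreSQL tuples in C) reproduces his example:

```python
# Sketch of the UPDATE transformation rules from patch 0003: the published
# action depends on whether the old and new tuples match the row filter.

def transform_update(old_matches: bool, new_matches: bool) -> str:
    """Map row-filter results for the old and new tuple to a published action."""
    if not old_matches and not new_matches:
        return "DROP"    # neither tuple passes the filter: skip the change
    if not old_matches and new_matches:
        return "INSERT"  # row enters the filtered set
    if old_matches and not new_matches:
        return "DELETE"  # row leaves the filtered set
    return "UPDATE"      # row stays inside the filtered set

# Greg's example: UPDATE filter (a > 1), old row (1,1), new row (2,1).
# Old row fails the filter, new row passes, so the change is published
# as an INSERT rather than an UPDATE.
print(transform_update(old_matches=(1 > 1), new_matches=(2 > 1)))  # INSERT
```

The sketch shows the mechanics only; whether converting such a change to INSERT is the right semantics is exactly the question being debated in the replies that follow.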
On Fri, Dec 17, 2021 at 5:46 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
So using the v47 patch-set, I still find that the UPDATE above results in publication of an INSERT of (2,1), rather than an UPDATE of (1,1) to (2,1).
This is according to the 2nd UPDATE rule below, from patch 0003.
+ * old-row (no match) new-row (no match) -> (drop change)
+ * old-row (no match) new row (match) -> INSERT
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (match) new row (match) -> UPDATE
This is because the old row (1,1) doesn't match the UPDATE filter "(a>1)", but the new row (2,1) does.
This functionality doesn't seem right to me. I don't think it can be assumed that (1,1) was never published (and thus requires an INSERT rather than UPDATE) based on these checks, because in this example, (1,1) was previously published via a different operation - INSERT (and using a different filter too).
I think the fundamental problem here is that these UPDATE rules assume that the old (current) row was previously UPDATEd (and published, or not published, according to the filter applicable to UPDATE), but this is not necessarily the case.
Or am I missing something?
But it need not be correct in assuming that the old-row was part of a
previous INSERT either (and published, or not published according to
the filter applicable to an INSERT).
For example, change the sequence of inserts and updates prior to the
last update:
truncate tbl1 ;
insert into tbl1 values (1,5); ==> not replicated since insert and ! (b < 2);
update tbl1 set b = 1; ==> not replicated since update and ! (a > 1)
update tbl1 set a = 2; ==> replicated and update converted to insert
since (a > 1)
In this case, the last update "update tbl1 set a = 2; " is updating a
row that was previously updated and not inserted and not replicated to
the subscriber.
How does the replication logic differentiate between these two cases,
and decide if the update was previously published or not?
I think it's futile for the publisher side to try and figure out the
history of published rows. In fact, if this level of logic is required
then it is best implemented on the subscriber side, which then defeats
the purpose of a publication filter.
regards,
Ajin Cherian
Fujitsu Australia
On Fri, Dec 17, 2021 at 4:11 AM Peter Smith <smithpb2250@gmail.com> wrote:
PSA the v47* patch set.
Few comments on v47-0002:
=======================
1. The handling to find rowfilter for ancestors in
RelationGetInvalidRowFilterCol seems complex. It seems you are
accumulating non-partition relations as well in toprelid_in_pub. Can
we simplify such that we find the ancestor only for 'pubviaroot'
publications?
2. I think the name RelationGetInvalidRowFilterCol is confusing
because the same function is also used to get publication actions. Can
we name it as GetRelationPublicationInfo() and pass a bool parameter
to indicate whether row_filter info needs to be built. We can get the
invalid_row_filter column as output from that function.
3.
+GetRelationPublicationActions(Relation relation)
{
..
+ if (!relation->rd_pubactions)
+ (void) RelationGetInvalidRowFilterCol(relation);
+
+ return memcpy(pubactions, relation->rd_pubactions,
+ sizeof(PublicationActions));
..
..
}
I think here we can reverse the check such that if actions are set
just do memcpy and return otherwise get the relationpublicationactions
info.
4.
invalid_rowfilter_column_walker
{
..
/*
* If pubviaroot is true, we need to convert the column number of
* parent to the column number of child relation first.
*/
if (context->pubviaroot)
{
char *colname = get_attname(context->parentid, attnum, false);
attnum = get_attnum(context->relid, colname);
}
Here, in the comments, you can tell why you need this conversion. Can
we name this function as rowfilter_column_walker()?
5.
+/* For invalid_rowfilter_column_walker. */
+typedef struct {
+ AttrNumber invalid_rfcolnum; /* invalid column number */
+ Bitmapset *bms_replident; /* bitset of replica identity col indexes */
+ bool pubviaroot; /* true if we are validating the parent
+ * relation's row filter */
+ Oid relid; /* relid of the relation */
+ Oid parentid; /* relid of the parent relation */
+} rf_context;
Normally, we declare structs at the beginning of the file and for the
formatting of struct declarations, see other nearby structs like
RelIdCacheEnt.
6. Can we name IsRowFilterSimpleNode() as IsRowFilterSimpleExpr()?
--
With Regards,
Amit Kapila.
On Fri, Dec 17, 2021 at 1:50 PM Ajin Cherian <itsajin@gmail.com> wrote:
On Fri, Dec 17, 2021 at 5:46 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
So using the v47 patch-set, I still find that the UPDATE above results in publication of an INSERT of (2,1), rather than an UPDATE of (1,1) to (2,1).
This is according to the 2nd UPDATE rule below, from patch 0003:

+ * old-row (no match)    new-row (no match)  -> (drop change)
+ * old-row (no match)    new row (match)     -> INSERT
+ * old-row (match)       new-row (no match)  -> DELETE
+ * old-row (match)       new row (match)     -> UPDATE

This is because the old row (1,1) doesn't match the UPDATE filter "(a>1)", but the new row (2,1) does.
This functionality doesn't seem right to me. I don't think it can be assumed that (1,1) was never published (and thus requires an INSERT rather than UPDATE) based on these checks, because in this example, (1,1) was previously published via a different operation - INSERT (and using a different filter too).
I think the fundamental problem here is that these UPDATE rules assume that the old (current) row was previously UPDATEd (and published, or not published, according to the filter applicable to UPDATE), but this is not necessarily the case.
Or am I missing something?

But it need not be correct in assuming that the old-row was part of a
previous INSERT either (and published, or not published according to
the filter applicable to an INSERT).
For example, change the sequence of inserts and updates prior to the
last update:

truncate tbl1;
insert into tbl1 values (1,5); ==> not replicated since insert and ! (b < 2);
update tbl1 set b = 1; ==> not replicated since update and ! (a > 1)
update tbl1 set a = 2; ==> replicated and update converted to insert
since (a > 1)

In this case, the last update "update tbl1 set a = 2;" is updating a
row that was previously updated and not inserted and not replicated to
the subscriber.
How does the replication logic differentiate between these two cases,
and decide if the update was previously published or not?
I think it's futile for the publisher side to try and figure out the
history of published rows.
I also think so. One more thing, even if we want we might not be able
to apply the insert filter as the corresponding values may not be
logged.
--
With Regards,
Amit Kapila.
On Fri, Dec 17, 2021 at 7:20 PM Ajin Cherian <itsajin@gmail.com> wrote:
On Fri, Dec 17, 2021 at 5:46 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
So using the v47 patch-set, I still find that the UPDATE above results in publication of an INSERT of (2,1), rather than an UPDATE of (1,1) to (2,1).
This is according to the 2nd UPDATE rule below, from patch 0003:

+ * old-row (no match)    new-row (no match)  -> (drop change)
+ * old-row (no match)    new row (match)     -> INSERT
+ * old-row (match)       new-row (no match)  -> DELETE
+ * old-row (match)       new row (match)     -> UPDATE

This is because the old row (1,1) doesn't match the UPDATE filter "(a>1)", but the new row (2,1) does.
This functionality doesn't seem right to me. I don't think it can be assumed that (1,1) was never published (and thus requires an INSERT rather than UPDATE) based on these checks, because in this example, (1,1) was previously published via a different operation - INSERT (and using a different filter too).
I think the fundamental problem here is that these UPDATE rules assume that the old (current) row was previously UPDATEd (and published, or not published, according to the filter applicable to UPDATE), but this is not necessarily the case.
Or am I missing something?

But it need not be correct in assuming that the old-row was part of a
previous INSERT either (and published, or not published according to
the filter applicable to an INSERT).
For example, change the sequence of inserts and updates prior to the
last update:

truncate tbl1;
insert into tbl1 values (1,5); ==> not replicated since insert and ! (b < 2);
update tbl1 set b = 1; ==> not replicated since update and ! (a > 1)
update tbl1 set a = 2; ==> replicated and update converted to insert
since (a > 1)

In this case, the last update "update tbl1 set a = 2;" is updating a
row that was previously updated and not inserted and not replicated to
the subscriber.
How does the replication logic differentiate between these two cases,
and decide if the update was previously published or not?
I think it's futile for the publisher side to try and figure out the
history of published rows. In fact, if this level of logic is required
then it is best implemented on the subscriber side, which then defeats
the purpose of a publication filter.
I think it's a concern that, for such a basic example with only one
row, we get unpredictable (and even wrong) replication results
depending upon the order of operations.
Doesn't this problem result from allowing different WHERE clauses for
different pubactions for the same table?
My current thoughts are that this shouldn't be allowed, and also WHERE
clauses for INSERTs should, like UPDATE and DELETE, be restricted to
using only columns covered by the replica identity or primary key.
Regards,
Greg Nancarrow
Fujitsu Australia
PSA the v48* patch set.
Main differences from v47:
1. Addresses some review comments
~~
Details:
v48-0001 (main)
- Modify some regression tests [Vignesh 2/12] #1 (skipped), #4
- Remove redundant slot drop [Greg 15/12] #7
- Restore mode of gram.y file [Alvaro 16/12]
v48-0002 (validation)
- Modify some regression tests [Vignesh 2/12] #3
- Don't allow system columns in filters [Houz 16/12]
v48-0003 (new/old tuple)
- No change
v48-0004 (tab-complete and dump)
- No change
v48-0005 (for all tables)
- No change
------
[Vignesh 2/12] /messages/by-id/CALDaNm2bMD=wxOzMvfnHQ7LeGTPyZWy_Fu_8G24k7MJ7k1UqHQ@mail.gmail.com
[Greg 15/12] /messages/by-id/CAJcOf-dFo_kTroR2_k1x80TqN=-3oZC_2BGYe1O6e5JinrLKYg@mail.gmail.com
[Alvaro 16/12] /messages/by-id/202112162011.iiyqqzuzpg4x@alvherre.pgsql
[Houz 16/12] /messages/by-id/OS0PR01MB571694C3C0005B5D425CCB0694779@OS0PR01MB5716.jpnprd01.prod.outlook.com
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v48-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v48-0001-Row-filter-for-logical-replication.patchDownload
From 3ba15c0648d6d9504533bcfc872a560d2a7ddcdd Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 17 Dec 2021 18:59:42 +1100
Subject: [PATCH v48] Row-filter for logical replication.
This feature adds row filtering for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row-filter is per table. A new row-filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The row-filter WHERE clause for a table added to a publication that publishes
UPDATE and/or DELETE operations must contain only columns that are covered by
REPLICA IDENTITY. The row-filter WHERE clause for a table added to a publication
that publishes INSERT can use any column. If the row-filter evaluates to NULL,
it is treated as false. The WHERE clause allows simple expressions. Simple expressions
cannot contain any aggregate or window functions, non-immutable functions,
user-defined types, operators or functions. This restriction could possibly be
addressed in the future.
If you choose to do the initial table synchronization, only data that satisfies
the row-filters is pulled by the subscriber. If the subscription has several
publications in which a table has been published with different WHERE clauses,
rows which satisfy ANY of the expressions will be copied. If a subscriber is a
pre-15 version, the initial table synchronization won't use row-filters even
if they are defined in the publisher.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row-filter (if
the parameter is false, the default) or the root partitioned table row-filter.
Psql commands \dRp+ and \d+ will display any row-filters.
Author: Euler Taveira, Peter Smith
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
Combining multiple row-filters
==============================
The subscription is treated "as a union of all the publications" [1], so the
row-filters are combined with OR.
If the subscription has several publications in which the same table has been
published with different filters, those expressions get OR'ed together so that
rows satisfying any of the expressions will be replicated.
Notice this means if one of the publications has no filter at all then all other
filters become redundant.
Author: Peter Smith
[1] https://www.postgresql.org/message-id/574b4e78-2f35-acf3-4bdc-4b872582e739%40enterprisedb.com
Row-filter caching
==================
The cached row-filters (e.g. ExprState *) are invalidated only in function
rel_sync_cache_relation_cb, so it means the ALTER PUBLICATION for one table
will not cause row-filters of other tables to also become invalidated.
The caching of row-filters is done just before they are needed
(in the pgoutput_row_filter function).
If there are multiple publication row-filters for a given table these are all
combined/flattened into a single filter.
Author: Peter Smith, Greg Nancarrow
The filter caching is based on suggestions from Amit [1] [2], and Houz [3]
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
[3] https://www.postgresql.org/message-id/OS0PR01MB5716090A70A73ADF58C58950948D9%40OS0PR01MB5716.jpnprd01.prod.outlook.com
Cache ExprState per pubaction.
If a subscriber has multiple publications and these publications include the
same table then there can be multiple filters that apply to that table.
These filters are stored per-pubactions of the publications. There are 4 kinds
of pubaction ("insert", "update", "delete", "truncate"), but row-filters are
not applied for "truncate".
Filters for the same pubaction are all combined (OR'ed) and cached as one, so
at the end there are at most 3 cached filters per table.
The appropriate (pubaction) filter is executed according to the DML operation.
Author: Peter Smith
Discussion: https://www.postgresql.org/message-id/CAA4eK1%2BhVXfOSScbf5LUB%3D5is%3DwYaC6NBhLxuvetbWQnZRnsVQ%40mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 13 +-
doc/src/sgml/ref/create_publication.sgml | 37 ++-
doc/src/sgml/ref/create_subscription.sgml | 24 +-
src/backend/catalog/pg_publication.c | 69 ++++-
src/backend/commands/publicationcmds.c | 108 +++++++-
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 38 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/parser/parse_relation.c | 9 +
src/backend/replication/logical/tablesync.c | 118 +++++++-
src/backend/replication/pgoutput/pgoutput.c | 410 +++++++++++++++++++++++++++-
src/bin/psql/describe.c | 26 +-
src/include/catalog/pg_publication.h | 7 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 151 ++++++++++
src/test/regress/sql/publication.sql | 76 ++++++
src/test/subscription/t/027_row_filter.pl | 357 ++++++++++++++++++++++++
24 files changed, 1451 insertions(+), 51 deletions(-)
create mode 100644 src/test/subscription/t/027_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 03e2537..2f1f913 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6311,6 +6311,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index bb4ef5e..5d9869c 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -30,7 +30,7 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -52,7 +52,9 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
remove one or more tables/schemas from the publication. Note that adding
tables/schemas to a publication that is already subscribed to will require a
<literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on the
- subscribing side in order to become effective.
+ subscribing side in order to become effective. Note also that the combination
+ of <literal>DROP</literal> with <literal>WHERE</literal> clause is not
+ allowed.
</para>
<para>
@@ -109,7 +111,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index d805e8e..5aeee23 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -28,7 +28,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -79,6 +79,13 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ If the optional <literal>WHERE</literal> clause is specified, only rows
+ that satisfy the <replaceable class="parameter">expression</replaceable>
+ will be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
+ </para>
+
+ <para>
Only persistent base tables and partitioned tables can be part of a
publication. Temporary tables, unlogged tables, foreign tables,
materialized views, and regular views cannot be part of a publication.
@@ -226,6 +233,22 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ A <literal>WHERE</literal> clause must contain only columns that are
+ part of the primary key or are covered by the <literal>REPLICA
+ IDENTITY</literal>, in order for <command>UPDATE</command> and
+ <command>DELETE</command> operations to be published.
+ For publication of <command>INSERT</command> operations, any column
+ may be used in the <literal>WHERE</literal> clause.
+ If nullable columns are present in the <literal>WHERE</literal> clause,
+ possible NULL values should be accounted for in expressions, to avoid
+ unexpected results, because <literal>NULL</literal> values can cause
+ those expressions to evaluate to false.
+ A <literal>WHERE</literal> clause allows simple expressions. The simple
+ expression cannot contain any aggregate or window functions, non-immutable
+ functions, user-defined types, operators or functions.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -247,6 +270,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -260,6 +288,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..db255f3 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -208,6 +208,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
that are being subscribed to when the replication starts.
The default is <literal>true</literal>.
</para>
+ <para>
+ If the publications contain conditional expressions, it will affect
+ what data is copied. Refer to the
+ <xref linkend="sql-createsubscription-notes" /> for details.
+ </para>
</listitem>
</varlistentry>
@@ -293,7 +298,7 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</variablelist>
</refsect1>
- <refsect1>
+ <refsect1 id="sql-createsubscription-notes" xreflabel="Notes">
<title>Notes</title>
<para>
@@ -319,6 +324,23 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
the parameter <literal>create_slot = false</literal>. This is an
implementation restriction that might be lifted in a future release.
</para>
+
+ <para>
+ If any table in the publication has a <literal>WHERE</literal> clause, rows
+ that do not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published (i.e. they will be filtered out).
+ If the subscription has several publications in which the same table has been
+ published with different <literal>WHERE</literal> clauses, those expressions
+ (for the same publish operation) get OR'ed together so that rows satisfying any
+ of the expressions will be published. Also, if one of the publications for the
+ same table has no <literal>WHERE</literal> clause at all, or is a <literal>FOR
+ ALL TABLES</literal> or <literal>FOR ALL TABLES IN SCHEMA</literal> publication,
+ then all other <literal>WHERE</literal> clauses (for the same publish operation)
+ become redundant.
+ If the subscriber is a <productname>PostgreSQL</productname> version before 15
+ then any row filtering is ignored during the initial data synchronization phase.
+ </para>
+
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 62f10bc..0929aa0 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -276,21 +279,54 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
}
/*
+ * Transform a publication WHERE clause, ensuring it is coerced to boolean and
+ * necessary collation information is added if required, and add a new
+ * nsitem/RTE for the associated relation to the ParseState's namespace list.
+ */
+Node *
+GetTransformedWhereClause(ParseState *pstate, PublicationRelInfo *pri,
+ bool fixup_collation)
+{
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
+
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, pri->relation,
+ AccessShareLock, NULL, false, false);
+
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate, copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION WHERE");
+
+ /* Fix up collation information */
+ if (fixup_collation)
+ assign_expr_collations(pstate, whereclause);
+
+ return whereclause;
+}
+
+/*
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -311,10 +347,22 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a ParseState to parse with */
+ pstate = make_parsestate(NULL);
+
+ /*
+ * Get the transformed WHERE clause, of boolean type, with necessary
+ * collation information.
+ */
+ whereclause = GetTransformedWhereClause(pstate, pri, true);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -328,6 +376,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -344,6 +398,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 404bb5d..9ca743c 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -529,40 +529,96 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *delrels = NIL;
ListCell *oldlc;
+ /*
+ * Check if the relation is member of the existing schema in the
+ * publication or member of the schema list specified.
+ */
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
- /* Calculate which relations to drop. */
+ /*
+ * In order to recreate the relation list for the publication,
+ * look for existing relations that need not be dropped.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
+ ListCell *newlc;
+ PublicationRelInfo *oldrel;
bool found = false;
+ HeapTuple rftuple;
+ bool rfisnull = true;
+ Node *oldrelwhereclause = NULL;
+
+ /* look up the cache for the old relmap */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(oldrelid),
+ ObjectIdGetDatum(pubid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ Datum whereClauseDatum;
+
+ whereClauseDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual,
+ &rfisnull);
+ if (!rfisnull)
+ oldrelwhereclause = stringToNode(TextDatumGetCString(whereClauseDatum));
+
+ ReleaseSysCache(rftuple);
+ }
foreach(newlc, rels)
{
PublicationRelInfo *newpubrel;
newpubrel = (PublicationRelInfo *) lfirst(newlc);
+
+ /*
+ * Look if any of the new set of relations match with
+ * the existing relations in the publication. Additionally,
+ * if the relation has an associated where-clause, check the
+ * where-clauses also match. Drop the rest.
+ */
if (RelationGetRelid(newpubrel->relation) == oldrelid)
{
- found = true;
- break;
+ if (rfisnull && !newpubrel->whereClause)
+ {
+ found = true;
+ break;
+ }
+
+ if (!rfisnull && newpubrel->whereClause)
+ {
+ ParseState *pstate = make_parsestate(NULL);
+ Node *whereclause;
+
+ whereclause = GetTransformedWhereClause(pstate,
+ newpubrel,
+ false);
+ if (equal(oldrelwhereclause, whereclause))
+ {
+ free_parsestate(pstate);
+ found = true;
+ break;
+ }
+
+ free_parsestate(pstate);
+ }
}
}
- /* Not yet in the list, open it and add to the list */
- if (!found)
- {
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
+ if (oldrelwhereclause)
+ pfree(oldrelwhereclause);
- delrels = lappend(delrels, pubrel);
+ /*
+ * Add the non-matched relations to a list so that they can
+ * be dropped.
+ */
+ if (!found)
+ {
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrelid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
}
@@ -899,6 +955,7 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ List *relids_with_rf = NIL;
/*
* Open, share-lock, and check all the explicitly-specified relations
@@ -926,15 +983,26 @@ OpenTableList(List *tables)
*/
if (list_member_oid(relids, myrelid))
{
+ /* Disallow duplicate tables if there are any with row-filters. */
+ if (t->whereClause || list_member_oid(relids_with_rf, myrelid))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("conflicting or redundant row-filters for \"%s\"",
+ RelationGetRelationName(rel))));
+
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
+ if (t->whereClause)
+ relids_with_rf = lappend_oid(relids_with_rf, myrelid);
+
/*
* Add children of this rel, if requested, so that they too are added
* to the publication. A partitioned table can't have any inheritance
@@ -967,6 +1035,8 @@ OpenTableList(List *tables)
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -974,6 +1044,7 @@ OpenTableList(List *tables)
}
list_free(relids);
+ list_free(relids_with_rf);
return rels;
}
@@ -993,6 +1064,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -1088,6 +1161,11 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
RelationGetRelationName(rel))));
}
+ if (pubrel->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE")));
+
ObjectAddressSet(obj, PublicationRelRelationId, prid);
performDeletion(&obj, DROP_CASCADE, 0);
}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index df0b747..bd55ea6 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4833,6 +4833,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index cb7ddd4..028b8e5 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2312,6 +2312,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 3d4dd43..9da93a0 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9742,12 +9742,13 @@ CreatePublicationStmt:
* relation_expr here.
*/
PublicationObjSpec:
- TABLE relation_expr
+ TABLE relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_TABLE;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $2;
+ $$->pubtable->whereClause = $3;
}
| ALL TABLES IN_P SCHEMA ColId
{
@@ -9762,28 +9763,45 @@ PublicationObjSpec:
$$->pubobjtype = PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA;
$$->location = @5;
}
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ /*
+ * The OptWhereClause must be stored here but it is
+ * valid only for tables. If the ColId was mistakenly
+ * not a table this will be detected later in
+ * preprocess_pubobj_list() and an error is thrown.
+ */
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
$$->location = @1;
}
- | ColId indirection
+ | ColId indirection OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
+ $$->pubtable->whereClause = $3;
$$->location = @1;
}
/* grammar like tablename * , ONLY tablename, ONLY ( tablename ) */
- | extended_relation_expr
+ | extended_relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $1;
+ $$->pubtable->whereClause = $2;
}
| CURRENT_SCHEMA
{
@@ -17431,7 +17449,8 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ if (pubobj->name)
{
/* convert it to PublicationTable */
PublicationTable *pubtable = makeNode(PublicationTable);
@@ -17444,6 +17463,13 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
else if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_SCHEMA ||
pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA)
{
+ /* WHERE clause is not allowed on a schema object */
+ if (pubobj->pubtable && pubobj->pubtable->whereClause)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("WHERE clause not allowed for schema"),
+ parser_errposition(pubobj->location));
+
/*
* We can distinguish between the different type of schema
* objects based on whether name and pubtable is set.
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 542f916..29bebb7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index c5c3f26..036d9c6 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -3538,11 +3538,20 @@ errorMissingRTE(ParseState *pstate, RangeVar *relation)
rte->eref->aliasname)),
parser_errposition(pstate, relation->location)));
else
+ {
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_TABLE),
+ errmsg("invalid reference to table \"%s\" in publication WHERE expression",
+ relation->relname),
+ parser_errposition(pstate, relation->location)));
+
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_TABLE),
errmsg("missing FROM-clause entry for table \"%s\"",
relation->relname),
parser_errposition(pstate, relation->location)));
+ }
}
/*
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..c20c221 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -687,20 +687,24 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
- * Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * Get information about a remote relation, in a similar fashion to how the
+ * RELATION message provides information during replication. This function also
+ * returns the relation qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,80 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get the relation qual. DISTINCT prevents the same expression, published
+ * for this table by multiple publications, from being included more than
+ * once in the final expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ /*
+ * Multiple row filter expressions for the same table will be combined
+ * by COPY using OR. If any of the filter expressions for this table are
+ * null, it means the whole table will be copied. In this case it is not
+ * necessary to construct a unified row filter expression at all.
+ */
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+
+ /*
+ * A single entry without a row filter expression means the whole
+ * table is copied, so discard any previously collected
+ * expressions and return with none.
+ */
+ if (isnull)
+ {
+ if (*qual)
+ {
+ list_free_deep(*qual);
+ *qual = NIL;
+ }
+ break;
+ }
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +887,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +896,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +907,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with OR'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +927,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of OR'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
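As a reading aid (not part of the patch), the COPY-query construction in copy_table() above can be sketched in Python: multiple row filters are OR'ed into one WHERE clause, and an empty filter list means the whole table is copied. The function name and argument shapes here are illustrative only.

```python
def build_copy_query(nspname, relname, columns, quals):
    """Sketch of copy_table()'s query construction for initial sync.

    quals is the list of row filter expressions fetched from the
    publisher; an empty list means no filtering (copy everything).
    """
    target = f'"{nspname}"."{relname}"'
    if not quals:
        # Plain table, no row filter: fast path.
        return f"COPY {target} TO STDOUT"
    # Row filters present: build COPY (SELECT ...) with OR'ed filters.
    cols = ", ".join(columns)
    where = " OR ".join(quals)
    return f"COPY (SELECT {cols} FROM {target} WHERE {where}) TO STDOUT"
```

For example, two publications filtering the same table produce a single query whose WHERE clause is the disjunction of both filters.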
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6f6a203..2fa08e7 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -15,16 +15,24 @@
#include "access/tupconvert.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/makefuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -116,6 +124,24 @@ typedef struct RelationSyncEntry
PublicationActions pubactions;
/*
+ * An ExprState pointer alone cannot distinguish between no cache, an
+ * invalid cache, and a valid cache, so the flag exprstate_valid indicates
+ * whether the current cache is valid.
+ *
+ * Multiple ExprState entries might be used if there are multiple
+ * publications for a single table. Because publication actions can differ
+ * across publications, the expressions cannot always be combined into
+ * one, so there is one ExprState per publication action. Only 3
+ * publication actions are used for row filtering ("insert", "update",
+ * "delete"). The exprstate array is indexed by ReorderBufferChangeType.
+ */
+ bool exprstate_valid;
+#define IDX_PUBACTION_n 3
+ ExprState *exprstate[IDX_PUBACTION_n]; /* ExprState array for row filter.
+ One per publication action. */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+
+ /*
* OID of the relation to publish changes as. For a partition, this may
* be set to one of its ancestors whose schema will be used when
* replicating changes, if publish_via_partition_root is set for the
@@ -137,7 +163,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +172,14 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
+ Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +655,316 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+
+ /* Cache ExprState using CacheMemoryContext. */
+ Assert(CurrentMemoryContext == CacheMemoryContext);
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be cast to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * This is the same code as ExecPrepareExpr() but that is not used because
+ * we have no EState to pass it. There should probably be another function
+ * in the executor to handle the execution outside a normal Plan tree
+ * context.
+ */
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false, i.e. the change
+ * isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Check the change against the row filter, if any.
+ *
+ * Returns true if the change should be replicated, false otherwise.
+ */
+static bool
+pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
+ Relation relation, HeapTuple oldtuple, HeapTuple newtuple,
+ RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ List *rfnodes[] = {NIL, NIL, NIL}; /* One per pubaction */
+ bool no_filter[] = {false, false, false}; /* One per pubaction */
+
+ Assert(changetype == REORDER_BUFFER_CHANGE_INSERT ||
+ changetype == REORDER_BUFFER_CHANGE_UPDATE ||
+ changetype == REORDER_BUFFER_CHANGE_DELETE);
+ /*
+ * If the row filter cache is currently flagged "invalid", we don't yet
+ * know whether there are any row filters for this relation.
+ *
+ * This code usually executes only once per relation.
+ *
+ * NOTE: The ExprState cache could have been created up-front in
+ * get_rel_sync_entry() instead of being built on the fly below. It is
+ * done here because there are scenarios where get_rel_sync_entry() is
+ * called but no row will be published. For example, for truncate we may
+ * not need any row evaluation, so there is no need to compute it. It
+ * would also be wasted work if an error occurred before the filter was
+ * actually evaluated, and future operations that use get_rel_sync_entry()
+ * may not need to build an ExprState either. Furthermore, because the
+ * decision to publish or not is made AFTER the call to
+ * get_rel_sync_entry(), the filter evaluation may not be necessary at
+ * all. Hence this logic is deferred to the last moment, when we know it
+ * will be needed.
+ */
+ if (!entry->exprstate_valid)
+ {
+ MemoryContext oldctx;
+ int idx;
+ bool found_filters = false;
+ int idx_ins = REORDER_BUFFER_CHANGE_INSERT;
+ int idx_upd = REORDER_BUFFER_CHANGE_UPDATE;
+ int idx_del = REORDER_BUFFER_CHANGE_DELETE;
+
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState and cache it in entry->exprstate.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ *
+ * NOTE: Multiple publications might have multiple row filters for this
+ * relation. Since row filter usage depends on the DML operation,
+ * there are multiple lists (one for each operation) to which the row
+ * filters are appended.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+
+ /*
+ * Look up whether there is a row filter, and if so remember it in a list
+ * (per pubaction); if not, record that this pubaction has no filter.
+ * Code following this 'publications' loop will combine all filters.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(entry->publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ /* Gather the rfnodes per pubaction of this publication. */
+ if (pub->pubactions.pubinsert)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[idx_ins] = lappend(rfnodes[idx_ins], rfnode);
+ }
+ if (pub->pubactions.pubupdate)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[idx_upd] = lappend(rfnodes[idx_upd], rfnode);
+ }
+ if (pub->pubactions.pubdelete)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[idx_del] = lappend(rfnodes[idx_del], rfnode);
+ }
+ MemoryContextSwitchTo(oldctx);
+ }
+ else
+ {
+ /* Remember which pubactions have no row-filter. */
+ if (pub->pubactions.pubinsert)
+ no_filter[idx_ins] = true;
+ if (pub->pubactions.pubupdate)
+ no_filter[idx_upd] = true;
+ if (pub->pubactions.pubdelete)
+ no_filter[idx_del] = true;
+
+ /* Quick exit loop if all pubactions have no row-filter. */
+ if (no_filter[idx_ins] && no_filter[idx_upd] && no_filter[idx_del])
+ {
+ ReleaseSysCache(rftuple);
+ break;
+ }
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+
+ } /* loop all subscribed publications */
+
+ /*
+ * Now all the filters for all pubactions are known. Combine the filters
+ * that share the same pubaction.
+ *
+ * All row filter expressions for a pubaction are discarded if any
+ * publication-relation entry lacks a row filter, because the expressions
+ * are combined with OR: the absence of a filter means all rows are
+ * replicated, which makes the combined expression always true.
+ */
+ for (idx = 0; idx < IDX_PUBACTION_n; idx++)
+ {
+ int n_filters;
+
+ if (no_filter[idx])
+ {
+ if (rfnodes[idx])
+ {
+ list_free_deep(rfnodes[idx]);
+ rfnodes[idx] = NIL;
+ }
+ }
+
+ /*
+ * If there were one or more filters for this pubaction, combine them
+ * (if necessary) and cache the ExprState.
+ */
+ n_filters = list_length(rfnodes[idx]);
+ if (n_filters > 0)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes[idx], -1) : linitial(rfnodes[idx]);
+ entry->exprstate[idx] = pgoutput_row_filter_init_expr(rfnode);
+ MemoryContextSwitchTo(oldctx);
+
+ found_filters = true; /* flag that we will need slots made */
+ }
+ } /* for each pubaction */
+
+ if (found_filters)
+ {
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create tuple table slots for row filter. Create a copy of the
+ * TupleDesc as it needs to live as long as the cache remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ entry->exprstate_valid = true;
+ }
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate[changetype])
+ return true;
+
+ if (message_level_is_interesting(DEBUG3))
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * NOTE: Multiple publication row filters have already been combined into a
+ * single exprstate (for this pubaction).
+ */
+ if (entry->exprstate[changetype])
+ {
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(entry->exprstate[changetype], ecxt);
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +991,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +1015,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +1022,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(change->action, data, relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * The schema must be sent before the relation-switching logic
+ * below, because it also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +1055,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(change->action, data, relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +1089,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(change->action, data, relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +1158,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1116,9 +1480,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1139,8 +1504,13 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->exprstate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate[REORDER_BUFFER_CHANGE_INSERT] = NULL;
+ entry->exprstate[REORDER_BUFFER_CHANGE_UPDATE] = NULL;
+ entry->exprstate[REORDER_BUFFER_CHANGE_DELETE] = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1245,9 +1615,6 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
}
list_free(pubids);
@@ -1310,6 +1677,7 @@ static void
rel_sync_cache_relation_cb(Datum arg, Oid relid)
{
RelationSyncEntry *entry;
+ int idx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1354,6 +1722,25 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Row filter cache cleanups. (Will be rebuilt later if needed).
+ */
+ entry->exprstate_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ /* Cleanup the ExprState for each of the pubactions. */
+ for (idx = 0; idx < IDX_PUBACTION_n; idx++)
+ {
+ if (entry->exprstate[idx] != NULL)
+ {
+ pfree(entry->exprstate[idx]);
+ entry->exprstate[idx] = NULL;
+ }
+ }
}
}
@@ -1365,6 +1752,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1762,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1392,6 +1782,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
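The cache-building step of pgoutput_row_filter() above follows a simple rule: filters are collected per pubaction, and if any publication publishes that action without a filter, all filters for that action are dropped (everything replicates); otherwise the filters are OR'ed into one expression. A hedged Python sketch of that rule, with illustrative names only (this is not the patch's code):

```python
def combine_row_filters(publications):
    """Sketch of pgoutput_row_filter()'s per-pubaction filter combination.

    Each publication is (actions, rf) where actions is a subset of
    {"insert", "update", "delete"} and rf is an SQL expression string or
    None (no filter). Returns the combined filter per action; None means
    "publish all rows" for that action.
    """
    actions_all = ("insert", "update", "delete")
    filters = {a: [] for a in actions_all}
    no_filter = {a: False for a in actions_all}
    for actions, rf in publications:
        for a in actions:
            if rf is None:
                no_filter[a] = True   # this pubaction replicates all rows
            else:
                filters[a].append(rf)
    combined = {}
    for a in actions_all:
        fs = filters[a]
        if no_filter[a] or not fs:
            combined[a] = None        # an absent filter wins: no filtering
        else:
            combined[a] = fs[0] if len(fs) == 1 else "(" + " OR ".join(fs) + ")"
    return combined
```

For instance, if one publication filters inserts but another publishes inserts unfiltered, inserts end up unfiltered, while updates keep the first publication's filter.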
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c28788e..929b2f5 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2868,17 +2868,21 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
" JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
"WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
"UNION\n"
"SELECT pubname\n"
+ " , pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+ " JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
"WHERE pr.prrelid = '%s'\n"
"UNION\n"
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
"WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
"ORDER BY 1;",
@@ -2914,6 +2918,13 @@ describeOneTableDetails(const char *schemaname,
printfPQExpBuffer(&buf, " \"%s\"",
PQgetvalue(result, i, 0));
+ /* row filter (if any) */
+ if (pset.sversion >= 150000)
+ {
+ if (!PQgetisnull(result, i, 1))
+ appendPQExpBuffer(&buf, " WHERE %s", PQgetvalue(result, i, 1));
+ }
+
printTableAddFooter(&cont, buf.data);
}
PQclear(result);
@@ -5833,8 +5844,12 @@ addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
for (i = 0; i < count; i++)
{
if (!singlecol)
+ {
printfPQExpBuffer(buf, " \"%s.%s\"", PQgetvalue(res, i, 0),
PQgetvalue(res, i, 1));
+ if (!PQgetisnull(res, i, 2))
+ appendPQExpBuffer(buf, " WHERE %s", PQgetvalue(res, i, 2));
+ }
else
printfPQExpBuffer(buf, " \"%s\"", PQgetvalue(res, i, 0));
@@ -5963,8 +5978,15 @@ describePublications(const char *pattern)
{
/* Get the tables for the specified publication */
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 902f2f2..96c55f6 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -20,6 +20,7 @@
#include "catalog/genbki.h"
#include "catalog/objectaddress.h"
#include "catalog/pg_publication_d.h"
+#include "parser/parse_node.h"
/* ----------------
* pg_publication definition. cpp turns this into
@@ -86,6 +87,7 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -123,13 +125,16 @@ extern List *GetPubPartitionOptionRelations(List *result,
extern bool is_publishable_relation(Relation rel);
extern bool is_schema_publication(Oid pubid);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
extern char *get_publication_name(Oid pubid, bool missing_ok);
+extern Node *GetTransformedWhereClause(ParseState *pstate,
+ PublicationRelInfo *pri,
+ bool bfixupcollation);
#endif /* PG_PUBLICATION_H */
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 4c5a8a3..e437a55 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3642,6 +3642,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 5ac2d66..5a49003 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -239,6 +239,157 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tbl6(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
+ "public.testpub_rf_tbl3" WHERE ((e > 1000) AND (e < 2000))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE ((e > 1000) AND (e < 2000))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
+
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dplus_rf_yes FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dplus_rf_no FOR TABLE testpub_rf_tbl1;
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+ Table "public.testpub_rf_tbl1"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+----------+--------------+-------------
+ a | integer | | | | plain | |
+ b | text | | | | extended | |
+Publications:
+ "testpub_dplus_rf_no"
+ "testpub_dplus_rf_yes" WHERE (a > 1)
+
+DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+ Publication testpub_syntax1
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (e < 999)
+
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+ Publication testpub_syntax2
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "testpub_rf_myschema.testpub_rf_tbl5" WHERE (h < 999)
+
+DROP PUBLICATION testpub_syntax2;
+-- fail - schemas are not allowed WHERE row-filter
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a =...
+ ^
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+ERROR: WHERE clause for schema not allowed
+LINE 1: ...ax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf...
+ ^
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+RESET client_min_messages;
+-- fail - aggregate functions not allowed in WHERE clause
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e < AVG(e));
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...TION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e < AVG(e));
+ ^
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tbl6 WHERE (i < 99);
+ERROR: cannot add relation "testpub_rf_myschema1.testpub_rf_tbl6" to publication
+DETAIL: Table's schema "testpub_rf_myschema1" is already part of the publication or part of the specified schema list.
+RESET client_min_messages;
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tbl6;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 56dd358..47bdba8 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -134,6 +134,82 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tbl6(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dplus_rf_yes FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dplus_rf_no FOR TABLE testpub_rf_tbl1;
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;
+-- fail - schemas are not allowed WHERE row-filter
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
+RESET client_min_messages;
+-- fail - aggregate functions not allowed in WHERE clause
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e < AVG(e));
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tbl6 WHERE (i < 99);
+RESET client_min_messages;
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tbl6;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
new file mode 100644
index 0000000..64e71d0
--- /dev/null
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -0,0 +1,357 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 10;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4a FOR TABLE tab_rowfilter_4 WHERE (c % 2 = 0)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4b FOR TABLE tab_rowfilter_4"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) SELECT generate_series(1, 10)");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4a, tap_pub_4b"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if the
+# c value is divisible by 2 OR 3 (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(13|2|20), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but when one publication has no filter it
+# ORs in everything, i.e. the same as having no filter at all.
+# Expect all rows: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(10|1|10), 'check initial data copy from table tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is false, so the row filter
+# from the leaf partition is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) VALUES (21), (22), (23), (24), (25)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) VALUES (0), (11), (12)");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if the
+# c value is divisible by 2 OR 3.
+#
+# Expect original rows (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+# Plus (21, 22, 24)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(16|2|24), 'check replicated rows to tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but when one publication has no filter it
+# ORs in everything, i.e. the same as having no filter at all.
+# Expect all rows from initial copy: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+# And also (0, 11, 12)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, the row filter evaluates to NULL (not true) because b is NULL
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, the row filter references column b, which is not part of
+# the PK or REPLICA IDENTITY; the old tuple therefore has b = NULL, so the
+# row filter does not evaluate to true
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# publication option publish_via_partition_root is true, so the row filter
+# from the root partitioned table is used
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After the TRUNCATE and REFRESH PUBLICATION, the initial data copy applies
+# the root partitioned table's row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
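As a side note (not part of the patch): the OR'ed-filter arithmetic that the TAP test above asserts for tab_rowfilter_2 (count|min|max of 13|2|20 after the initial copy, 16|2|24 after the later inserts) can be sanity-checked outside the database. The helper name `replicated` below is illustrative only.

```python
# Sanity-check of the OR'ed row-filter semantics exercised in 027_row_filter.pl.
# tap_pub_1 publishes tab_rowfilter_2 WHERE (c % 2 = 0) and tap_pub_2
# publishes it WHERE (c % 3 = 0); a row is replicated if either filter is true.

def replicated(rows, filters):
    """Keep rows for which at least one publication's filter matches."""
    return [c for c in rows if any(f(c) for f in filters)]

filters = [lambda c: c % 2 == 0, lambda c: c % 3 == 0]

# Initial data copy: INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)
initial = replicated(range(1, 21), filters)
print(len(initial), min(initial), max(initial))       # 13 2 20

# Post-subscription inserts: VALUES (21), (22), (23), (24), (25)
combined = initial + replicated([21, 22, 23, 24, 25], filters)
print(len(combined), min(combined), max(combined))    # 16 2 24
```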
v48-0003-Row-filter-updates-based-on-old-new-tuples.patch (application/octet-stream; name=v48-0003-Row-filter-updates-based-on-old-new-tuples.patch) Download
From 3db17a7a7386fd9a2f4ed33a9c5dacc68446e4d3 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 17 Dec 2021 20:39:04 +1100
Subject: [PATCH v48] Row-filter updates based on old/new tuples
When applying a row filter to an UPDATE, check both the old tuple and the
new tuple to decide how the update needs to be transformed:
UPDATE
old-row (match) new-row (no match) -> DELETE
old-row (no match) new-row (match) -> INSERT
old-row (match) new-row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Tuples that have already been deformed are also cached in slots to avoid
deforming the same tuple multiple times.
Author: Ajin Cherian
---
src/backend/replication/logical/proto.c | 38 +++--
src/backend/replication/pgoutput/pgoutput.c | 228 ++++++++++++++++++++++++----
src/include/replication/logicalproto.h | 7 +-
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/027_row_filter.pl | 4 +-
src/tools/pgindent/typedefs.list | 1 +
6 files changed, 235 insertions(+), 49 deletions(-)
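The transformation table in the commit message can be summarized as a small decision function. This is an illustrative Python sketch of the rule, not the patch's C implementation in pgoutput.c.

```python
# Sketch of the UPDATE transformation rule from the commit message: given
# whether the old and new tuples match the publication's row filter, decide
# what change (if any) to send to the subscriber.

def transform_update(old_matches: bool, new_matches: bool) -> str:
    if old_matches and not new_matches:
        return "DELETE"   # row leaves the published set
    if not old_matches and new_matches:
        return "INSERT"   # row enters the published set
    if old_matches and new_matches:
        return "UPDATE"   # row stays within the published set
    return "DROP"         # row was never visible to the subscriber

print(transform_update(True, False))   # DELETE
print(transform_update(False, True))   # INSERT
```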
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..110ccff 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -31,8 +32,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
- HeapTuple tuple, bool binary);
-
+ HeapTuple tuple, TupleTableSlot *slot,
+ bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -410,7 +411,7 @@ logicalrep_write_insert(StringInfo out, TransactionId xid, Relation rel,
pq_sendint32(out, RelationGetRelid(rel));
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, NULL, binary);
}
/*
@@ -442,7 +443,8 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
*/
void
logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
- HeapTuple oldtuple, HeapTuple newtuple, bool binary)
+ HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary)
{
pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
@@ -463,11 +465,11 @@ logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
pq_sendbyte(out, 'O'); /* old tuple follows */
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, oldslot, binary);
}
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, newslot, binary);
}
/*
@@ -536,7 +538,7 @@ logicalrep_write_delete(StringInfo out, TransactionId xid, Relation rel,
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, NULL, binary);
}
/*
@@ -749,13 +751,16 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
-logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binary)
+logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, TupleTableSlot *slot,
+					   bool binary)
{
TupleDesc desc;
- Datum values[MaxTupleAttributeNumber];
- bool isnull[MaxTupleAttributeNumber];
+ Datum *values;
+ bool *isnull;
int i;
uint16 nliveatts = 0;
+ Datum attr_values[MaxTupleAttributeNumber];
+ bool attr_isnull[MaxTupleAttributeNumber];
desc = RelationGetDescr(rel);
@@ -771,7 +776,17 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binar
enlargeStringInfo(out, tuple->t_len +
nliveatts * (1 + 4));
- heap_deform_tuple(tuple, desc, values, isnull);
+ if (TupIsNull(slot))
+ {
+ values = attr_values;
+ isnull = attr_isnull;
+ heap_deform_tuple(tuple, desc, values, isnull);
+ }
+ else
+ {
+ values = slot->tts_values;
+ isnull = slot->tts_isnull;
+ }
/* Write the values */
for (i = 0; i < desc->natts; i++)
@@ -832,6 +847,7 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binar
ReleaseSysCache(typtup);
}
+
}
/*
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 2fa08e7..8a733cb 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,6 +13,7 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
#include "catalog/pg_publication_rel.h"
@@ -25,6 +26,7 @@
#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
#include "utils/builtins.h"
@@ -140,6 +142,9 @@ typedef struct RelationSyncEntry
ExprState *exprstate[IDX_PUBACTION_n]; /* ExprState array for row filter.
One per publication action. */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
+ TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
+ TupleTableSlot *tmp_new_tuple; /* slot for temporary new tuple used for expression evaluation */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -174,11 +179,15 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
- Relation relation, HeapTuple oldtuple,
- HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter(enum ReorderBufferChangeType changetype, Relation relation,
+ HeapTuple oldtuple, HeapTuple newtuple,
+ TupleTableSlot *slot, RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update_check(enum ReorderBufferChangeType changetype, Relation relation,
+ HeapTuple oldtuple, HeapTuple newtuple,
+ RelationSyncEntry *entry, ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -742,26 +751,124 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * An update is checked against the row filter, if any.
+ * Updates are transformed to inserts or deletes based on whether the
+ * old_tuple and new_tuple match the filter. The new action is returned
+ * in the action parameter; if unchanged, the action remains an update.
+ * old-row (no match) new-row (no match) -> (drop change)
+ * old-row (no match) new-row (match)    -> INSERT
+ * old-row (match)    new-row (no match) -> DELETE
+ * old-row (match)    new-row (match)    -> UPDATE
+ * Returns true if the change is to be replicated, else false.
*/
static bool
-pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
- Relation relation, HeapTuple oldtuple, HeapTuple newtuple,
- RelationSyncEntry *entry)
+pgoutput_row_filter_update_check(enum ReorderBufferChangeType changetype, Relation relation,
+ HeapTuple oldtuple, HeapTuple newtuple,
+ RelationSyncEntry *entry, ReorderBufferChangeType *action)
{
- EState *estate;
- ExprContext *ecxt;
- ListCell *lc;
- bool result = true;
- Oid relid = RelationGetRelid(relation);
- List *rfnodes[] = {NIL, NIL, NIL}; /* One per pubaction */
- bool no_filter[] = {false, false, false}; /* One per pubaction */
+ TupleDesc desc = RelationGetDescr(relation);
+ int i;
+ bool old_matched, new_matched;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
Assert(changetype == REORDER_BUFFER_CHANGE_INSERT ||
changetype == REORDER_BUFFER_CHANGE_UPDATE ||
changetype == REORDER_BUFFER_CHANGE_DELETE);
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate[changetype])
+ return true;
+
+ /* update requires a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /* Clear the tuples */
+ ExecClearTuple(entry->old_tuple);
+ ExecClearTuple(entry->new_tuple);
+
+ /*
+ * If no old_tuple, then none of the replica identity columns changed
+ * and this would reduce to a simple update.
+ */
+ if (!oldtuple)
+ {
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ return pgoutput_row_filter(changetype, relation, NULL, newtuple, NULL, entry);
+ }
+
+ old_slot = entry->old_tuple;
+ new_slot = entry->new_tuple;
+ tmp_new_slot = entry->tmp_new_tuple;
+
+ ExecClearTuple(tmp_new_slot);
+
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, new_slot);
+
+ /*
+ * For updates, both the newtuple and oldtuple need to be checked
+ * against the row filter. The newtuple might not contain all the
+ * replica identity columns, in which case those columns need to be
+ * copied over from the oldtuple.
+ */
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ /* if the column in the new_tuple is null, nothing to do */
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ /*
+ * Unchanged toasted replica identity columns are only detoasted
+ * in the old tuple; copy them over to the new tuple.
+ */
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ Assert(!old_slot->tts_isnull[i] &&
+ !VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]));
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ }
+
+ }
+
+ old_matched = pgoutput_row_filter(changetype, relation, NULL, NULL, old_slot, entry);
+ new_matched = pgoutput_row_filter(changetype, relation, NULL, NULL, tmp_new_slot, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (!old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && old_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter state on first use.
+ */
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
+{
+ ListCell *lc;
+ List *rfnodes[] = {NIL, NIL, NIL}; /* One per pubaction */
+ bool no_filter[] = {false, false, false}; /* One per pubaction */
+
/*
* If the row filter caching is currently flagged "invalid" then it means we
* don't know yet if there is/isn't any row filters for this relation.
@@ -921,11 +1028,34 @@ pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
MemoryContextSwitchTo(oldctx);
}
entry->exprstate_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(enum ReorderBufferChangeType changetype, Relation relation,
+ HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *slot,
+ RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+
+ Assert(changetype == REORDER_BUFFER_CHANGE_INSERT ||
+ changetype == REORDER_BUFFER_CHANGE_UPDATE ||
+ changetype == REORDER_BUFFER_CHANGE_DELETE);
/* Bail out if there is no row filter */
if (!entry->exprstate[changetype])
@@ -944,7 +1074,12 @@ pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
ecxt = GetPerTupleExprContext(estate);
ecxt->ecxt_scantuple = entry->scantuple;
- ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+ if (newtuple || oldtuple)
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+ else
+ {
+ ecxt->ecxt_scantuple = slot;
+ }
/*
* NOTE: Multiple publication row-filters have already been combined to a
@@ -957,7 +1092,6 @@ pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
}
/* Cleanup allocated resources */
- ResetExprContext(ecxt);
FreeExecutorState(estate);
PopActiveSnapshot();
@@ -1015,6 +1149,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -1022,10 +1159,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
- /* Check row filter. */
- if (!pgoutput_row_filter(change->action, data, relation, NULL, tuple, relentry))
- break;
-
/*
* Schema should be sent before the logic that replaces the
* relation because it also sends the ancestor's relation.
@@ -1043,6 +1176,11 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
tuple = execute_attr_map_tuple(tuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(change->action, relation, NULL, tuple,
+ NULL, relentry))
+ break;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_insert(ctx->out, xid, relation, tuple,
data->binary);
@@ -1054,10 +1192,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
-
- /* Check row filter. */
- if (!pgoutput_row_filter(change->action, data, relation, oldtuple, newtuple, relentry))
- break;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1078,9 +1213,34 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
}
+ /* Check row filter */
+ if (!pgoutput_row_filter_update_check(change->action, relation,
+ oldtuple, newtuple, relentry,
+ &modified_action))
+ break;
+
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ logicalrep_write_update(ctx->out, xid, relation,
+ oldtuple, newtuple, relentry->old_tuple,
+ relentry->new_tuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1089,10 +1249,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
- /* Check row filter. */
- if (!pgoutput_row_filter(change->action, data, relation, oldtuple, NULL, relentry))
- break;
-
maybe_send_schema(ctx, change, relation, relentry);
/* Switch relation if publishing via root. */
@@ -1106,6 +1262,11 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(change->action, relation, oldtuple,
+ NULL, NULL, relentry))
+ break;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
data->binary);
@@ -1508,6 +1669,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
+ entry->new_tuple = NULL;
+ entry->old_tuple = NULL;
+ entry->tmp_new_tuple = NULL;
entry->exprstate[REORDER_BUFFER_CHANGE_INSERT] = NULL;
entry->exprstate[REORDER_BUFFER_CHANGE_UPDATE] = NULL;
entry->exprstate[REORDER_BUFFER_CHANGE_DELETE] = NULL;
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..427c40a 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -211,7 +212,11 @@ extern void logicalrep_write_insert(StringInfo out, TransactionId xid,
extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup);
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
- HeapTuple newtuple, bool binary);
+ HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index de6b73d..a2f25f6 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -277,7 +277,8 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+# but the row is deleted by the transformed UPDATE below.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -289,7 +290,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 89f3917..a9a1d0d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2200,6 +2200,7 @@ ReorderBufferApplyChangeCB
ReorderBufferApplyTruncateCB
ReorderBufferBeginCB
ReorderBufferChange
+ReorderBufferChangeType
ReorderBufferCommitCB
ReorderBufferCommitPreparedCB
ReorderBufferDiskChange
--
1.8.3.1
v48-0005-Row-filter-handle-FOR-ALL-TABLES.patchapplication/octet-stream; name=v48-0005-Row-filter-handle-FOR-ALL-TABLES.patchDownload
From 8bbe95240ccdba84d533ca54508ded8ef3d60824 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 17 Dec 2021 20:41:40 +1100
Subject: [PATCH v48] Row-filter handle FOR ALL TABLES
If one of the subscriber's publications was created using FOR ALL TABLES then
that implies NO row-filtering will be applied.
If one of the subscriber's publications was created using FOR ALL TABLES IN
SCHEMA and the table belongs to that same schema, then that also implies NO
row-filtering will be applied.
These rules override any row filters from other subscribed publications.
Note that the initial COPY does not take publication operations into account.
Author: Peter Smith
Reported By: Tang
Discussion: https://www.postgresql.org/message-id/OS0PR01MB6113D82113AA081ACF710D0CFB6E9%40OS0PR01MB6113.jpnprd01.prod.outlook.com
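The precedence rule described above can be sketched as follows. This is an illustrative simplification, not the patch's code; the Pub struct and function name are hypothetical stand-ins for the no_filter[] bookkeeping in pgoutput_row_filter_init:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified publication descriptor for illustration. */
typedef struct
{
	bool		alltables;		/* FOR ALL TABLES, or an ALL TABLES IN
								 * SCHEMA publication covering this table */
	bool		has_filter;		/* a WHERE clause exists for this table */
} Pub;

/*
 * Returns true only if row filtering applies to the table at all: a
 * single FOR ALL TABLES (or matching ALL TABLES IN SCHEMA) publication
 * forces "no filter", overriding filters from every other publication.
 */
static bool
filtering_applies(const Pub *pubs, size_t npubs)
{
	bool		any_filter = false;
	size_t		i;

	for (i = 0; i < npubs; i++)
	{
		if (pubs[i].alltables)
			return false;		/* overrides every other row filter */
		if (pubs[i].has_filter)
			any_filter = true;
	}
	return any_filter;
}
```

In other words, the check is a disjunction over publications: one unfiltered "all tables" subscription is enough to disable filtering for the whole table.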
---
src/backend/replication/logical/tablesync.c | 48 +++++++----
src/backend/replication/pgoutput/pgoutput.c | 63 ++++++++++++++-
src/test/subscription/t/027_row_filter.pl | 118 ++++++++++++++++++++++++++--
3 files changed, 202 insertions(+), 27 deletions(-)
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index c20c221..469aadc 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -802,21 +802,22 @@ fetch_remote_table_info(char *nspname, char *relname,
walrcv_clear_result(res);
/*
+ * If any publication has puballtables true then all row-filtering is
+ * ignored.
+ *
+ * If the relation is a member of a schema of a subscribed publication that
+ * said ALL TABLES IN SCHEMA then all row-filtering is ignored.
+ *
* Get relation qual. DISTINCT avoids the same expression of a table in
* multiple publications from being included multiple times in the final
* expression.
*/
if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
{
- resetStringInfo(&cmd);
- appendStringInfo(&cmd,
- "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
- " FROM pg_publication p "
- " INNER JOIN pg_publication_rel pr "
- " ON (p.oid = pr.prpubid) "
- " WHERE pr.prrelid = %u "
- " AND p.pubname IN (", lrel->remoteid);
+ StringInfoData pub_names;
+ /* Build the pubname list. */
+ initStringInfo(&pub_names);
first = true;
foreach(lc, MySubscription->publications)
{
@@ -825,11 +826,28 @@ fetch_remote_table_info(char *nspname, char *relname,
if (first)
first = false;
else
- appendStringInfoString(&cmd, ", ");
+ appendStringInfoString(&pub_names, ", ");
- appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ appendStringInfoString(&pub_names, quote_literal_cstr(pubname));
}
- appendStringInfoChar(&cmd, ')');
+
+ /* Check for row-filters */
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u AND p.pubname IN ( %s ) "
+ " AND NOT (select bool_or(puballtables) "
+ " FROM pg_publication "
+ " WHERE pubname in ( %s )) "
+ " AND (SELECT count(1)=0 "
+ " FROM pg_publication_namespace pn, pg_class c "
+ " WHERE c.oid = %u AND c.relnamespace = pn.pnnspid)",
+ lrel->remoteid,
+ pub_names.data,
+ pub_names.data,
+ lrel->remoteid);
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
@@ -847,18 +865,14 @@ fetch_remote_table_info(char *nspname, char *relname,
slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
{
- Datum rf = slot_getattr(slot, 1, &isnull);
+ Datum rf = slot_getattr(slot, 1, &isnull);
if (!isnull)
*qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
ExecClearTuple(slot);
- /*
- * One entry without a row filter expression means clean up
- * previous expressions (if there are any) and return with no
- * expressions.
- */
+			/* A single null filter means no filtering; discard any others. */
if (isnull)
{
if (*qual)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 8a733cb..43d0125 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -912,13 +912,68 @@ pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntr
* relation. Since row filter usage depends on the DML operation,
* there are multiple lists (one for each operation) which row filters
* will be appended.
+ *
+ * NOTE: FOR ALL TABLES implies "use no filters", so it takes precedence.
+ *
+ * NOTE: ALL TABLES IN SCHEMA also implies "use no filters" if the
+ * table is a member of that same schema.
*/
foreach(lc, data->publications)
{
- Publication *pub = lfirst(lc);
- HeapTuple rftuple;
- Datum rfdatum;
- bool rfisnull;
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+ List *schemarelids = NIL;
+
+ /*
+ * If the publication is FOR ALL TABLES then it is treated the same
+ * as if this table has no filters (even if for some other
+ * publication it does).
+ */
+ if (pub->alltables)
+ {
+ if (pub->pubactions.pubinsert)
+ no_filter[idx_ins] = true;
+ if (pub->pubactions.pubupdate)
+ no_filter[idx_upd] = true;
+ if (pub->pubactions.pubdelete)
+ no_filter[idx_del] = true;
+
+ /* Quick exit loop if all pubactions have no row-filter. */
+ if (no_filter[idx_ins] && no_filter[idx_upd] && no_filter[idx_del])
+ break;
+
+ continue;
+ }
+
+ /*
+	 * If the publication is FOR ALL TABLES IN SCHEMA and the current relation
+	 * belongs to that schema, then this is also treated the same as if this
+	 * table has no filters (even if for some other publication it does).
+ */
+ schemarelids = GetAllSchemaPublicationRelations(pub->oid,
+ pub->pubviaroot ?
+ PUBLICATION_PART_ROOT :
+ PUBLICATION_PART_LEAF);
+ if (list_member_oid(schemarelids, entry->relid))
+ {
+ if (pub->pubactions.pubinsert)
+ no_filter[idx_ins] = true;
+ if (pub->pubactions.pubupdate)
+ no_filter[idx_upd] = true;
+ if (pub->pubactions.pubdelete)
+ no_filter[idx_del] = true;
+
+ list_free(schemarelids);
+
+ /* Quick exit loop if all pubactions have no row-filter. */
+ if (no_filter[idx_ins] && no_filter[idx_upd] && no_filter[idx_del])
+ break;
+
+ continue;
+ }
+ list_free(schemarelids);
/*
* Lookup if there is a row-filter, and if yes remember it in a list (per
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index a2f25f6..73add45 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
-use Test::More tests => 10;
+use Test::More tests => 14;
# create publisher node
my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
@@ -15,6 +15,116 @@ my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
$node_subscriber->init(allows_streaming => 'logical');
$node_subscriber->start;
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+
+# ====================================================================
+# Testcase start: FOR ALL TABLES
+#
+# The FOR ALL TABLES test must come first so that it is not affected by
+# all the other test tables that are later created.
+
+# create tables pub and sub
+$node_publisher->safe_psql('postgres', "CREATE TABLE tab_rf_x (x int primary key)");
+$node_subscriber->safe_psql('postgres', "CREATE TABLE tab_rf_x (x int primary key)");
+
+# insert some initial data
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rf_x (x) VALUES (0), (5), (10), (15), (20)");
+
+# create pub/sub
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_x FOR TABLE tab_rf_x WHERE (x > 10)");
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_forall FOR ALL TABLES");
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_x, tap_pub_forall");
+
+$node_publisher->wait_for_catchup($appname);
+# wait for initial table synchronization to finish
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# The subscription of the FOR ALL TABLES publication means there should be no
+# filtering on the tablesync COPY, so we expect all 5 rows to be present.
+my $result = $node_subscriber->safe_psql('postgres', "SELECT count(x) FROM tab_rf_x");
+is($result, qq(5), 'check initial data copy from table tab_rf_x should not be filtered');
+
+# Similarly, normal filtering after the initial phase will also have no effect.
+# Expected: 5 initial rows + 2 new rows = 7 rows
+$node_publisher->safe_psql('postgres', "INSERT INTO tab_rf_x (x) VALUES (-99), (99)");
+$node_publisher->wait_for_catchup($appname);
+$result = $node_subscriber->safe_psql('postgres', "SELECT count(x) FROM tab_rf_x");
+is($result, qq(7), 'check table tab_rf_x should not be filtered');
+
+# cleanup pub
+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_forall");
+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_x");
+$node_publisher->safe_psql('postgres', "DROP TABLE tab_rf_x");
+# cleanup sub
+$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub");
+$node_subscriber->safe_psql('postgres', "DROP TABLE tab_rf_x");
+
+# Testcase end: FOR ALL TABLES
+# ====================================================================
+
+# ====================================================================
+# Testcase start: ALL TABLES IN SCHEMA
+#
+# The ALL TABLES IN SCHEMA test is independent of all other test cases so it
+# cleans up after itself.
+
+# create tables pub and sub
+$node_publisher->safe_psql('postgres', "CREATE SCHEMA schema_rf_x");
+$node_publisher->safe_psql('postgres', "CREATE TABLE schema_rf_x.tab_rf_x (x int primary key)");
+$node_subscriber->safe_psql('postgres', "CREATE SCHEMA schema_rf_x");
+$node_subscriber->safe_psql('postgres', "CREATE TABLE schema_rf_x.tab_rf_x (x int primary key)");
+
+# insert some initial data
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO schema_rf_x.tab_rf_x (x) VALUES (0), (5), (10), (15), (20)");
+
+# create pub/sub
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_x FOR TABLE schema_rf_x.tab_rf_x WHERE (x > 10)");
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_allinschema FOR ALL TABLES IN SCHEMA schema_rf_x");
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_x, tap_pub_allinschema");
+
+$node_publisher->wait_for_catchup($appname);
+# wait for initial table synchronization to finish
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# The subscription of the ALL TABLES IN SCHEMA publication means there should be
+# no filtering on the tablesync COPY, so we expect all 5 rows to be present.
+$result = $node_subscriber->safe_psql('postgres', "SELECT count(x) FROM schema_rf_x.tab_rf_x");
+is($result, qq(5), 'check initial data copy from table tab_rf_x should not be filtered');
+
+# Similarly, normal filtering after the initial phase will also have no effect.
+# Expected: 5 initial rows + 2 new rows = 7 rows
+$node_publisher->safe_psql('postgres', "INSERT INTO schema_rf_x.tab_rf_x (x) VALUES (-99), (99)");
+$node_publisher->wait_for_catchup($appname);
+$result = $node_subscriber->safe_psql('postgres', "SELECT count(x) FROM schema_rf_x.tab_rf_x");
+is($result, qq(7), 'check table tab_rf_x should not be filtered');
+
+# cleanup pub
+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_allinschema");
+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_x");
+$node_publisher->safe_psql('postgres', "DROP TABLE schema_rf_x.tab_rf_x");
+$node_publisher->safe_psql('postgres', "DROP SCHEMA schema_rf_x");
+# cleanup sub
+$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub");
+$node_subscriber->safe_psql('postgres', "DROP TABLE schema_rf_x.tab_rf_x");
+$node_subscriber->safe_psql('postgres', "DROP SCHEMA schema_rf_x");
+
+# Testcase end: ALL TABLES IN SCHEMA
+# ====================================================================
+
# setup structure on publisher
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
@@ -127,8 +237,6 @@ $node_publisher->safe_psql('postgres',
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
-my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
-my $appname = 'tap_sub';
$node_subscriber->safe_psql('postgres',
"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4a, tap_pub_4b"
);
@@ -136,8 +244,6 @@ $node_subscriber->safe_psql('postgres',
$node_publisher->wait_for_catchup($appname);
# wait for initial table synchronization to finish
-my $synced_query =
- "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
$node_subscriber->poll_query_until('postgres', $synced_query)
or die "Timed out while waiting for subscriber to synchronize data";
@@ -148,7 +254,7 @@ $node_subscriber->poll_query_until('postgres', $synced_query)
# - INSERT (1980, 'not filtered') YES
# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
#
-my $result =
+$result =
$node_subscriber->safe_psql('postgres',
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is( $result, qq(1001|test 1001
--
1.8.3.1
v48-0004-Row-filter-tab-auto-complete-and-pgdump.patch (application/octet-stream)
From 51f8f90208c3ad255af7844cd547c88c2014f43e Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 17 Dec 2021 20:40:15 +1100
Subject: [PATCH v48] Row-filter tab auto-complete and pgdump
tab-auto-complete
-----------------
e.g.
"CREATE PUBLICATION <name> FOR TABLE <name>" - complete with "WHERE (".
"ALTER PUBLICATION <name> ADD|SET TABLE <name>" - complete with "WHERE (".
Author: Peter Smith
pg_dump
-------
Author: Euler Taveira
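With this change, a filtered publication should round-trip through pg_dump
roughly as follows (a sketch; pg_get_expr may add extra parentheses, and exact
dump formatting can differ):

```sql
-- on the source server:
CREATE PUBLICATION foo FOR TABLE departments WHERE (id > 2000 AND id <= 3000);

-- pg_dump should now emit the filter along with the table membership:
-- ALTER PUBLICATION foo ADD TABLE ONLY public.departments
--     WHERE (((id > 2000) AND (id <= 3000)));
```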
---
src/bin/pg_dump/pg_dump.c | 24 ++++++++++++++++++++----
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/tab-complete.c | 24 ++++++++++++++++++++++--
3 files changed, 43 insertions(+), 6 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 784771c..4acae2a 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4034,6 +4034,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4044,9 +4045,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4055,6 +4063,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4095,6 +4104,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationObject(&(pubrinfo[j].dobj), fout);
@@ -4162,8 +4175,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index f011ace..0ebdce5 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -630,6 +630,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index b524dc8..1d47634 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,19 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (");
+ /*
+ * "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with
+ * table attributes
+ *
+ * "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with
+ * table attributes
+ */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2777,13 +2790,20 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("TABLES", "TABLES IN SCHEMA");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
- else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
/*
+ * "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with
+ * table attributes
+ */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
+
+ /*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
* ..."
*/
--
1.8.3.1
v48-0002-Row-filter-validation.patch (application/octet-stream)
From 965f057d2dd5792bcae138d3aa3b13bf2ad134f7 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 17 Dec 2021 20:37:32 +1100
Subject: [PATCH v48] Row-filter validation
This patch implements parse-tree "walkers" to validate a row-filter.
Expression Node-kind validation
-------------------------------
Only simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no user-defined functions.
- no user-defined types.
- no system columns.
- no system functions (unless they are immutable). See design decision at [1].
[1] https://www.postgresql.org/message-id/CAA4eK1%2BXoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ%40mail.gmail.com
Permits only simple nodes including:
List, Const, BoolExpr, NullIfExpr, NullTest, BooleanTest, CoalesceExpr,
CaseExpr, CaseTestExpr, MinMaxExpr, ArrayExpr, ScalarArrayOpExpr, XmlExpr.
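To illustrate the intended effect of these rules, here is a sketch of what should
and should not be accepted (my_udf is a hypothetical user-defined function, not
part of the patch):

```sql
-- allowed: columns of the published table, constants, built-in operators
CREATE PUBLICATION p1 FOR TABLE t WHERE (a > 10 AND b IS NOT NULL);

-- rejected: a user-defined function in the filter should raise
-- "invalid publication WHERE expression" per the walker
CREATE PUBLICATION p2 FOR TABLE t WHERE (my_udf(a) > 0);
```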
Author: Peter Smith, Euler Taveira
REPLICA IDENTITY validation
---------------------------
For publish modes "delete" and "update", validate that any columns referenced
in the filter expression are part of the REPLICA IDENTITY or primary key.
Row filter column validation is done in CheckCmdReplicaIdentity, so that the
validation is executed only when an actual UPDATE or DELETE is executed on
the published relation. This is consistent with the existing replica identity
check and can detect row-filter-related changes in time.
Cache the results of the row filter column validation in the relcache to
reduce the cost of the validation. This is safe because every operation that
changes the row filter or the replica identity will invalidate the relcache.
Author: Hou zj
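For instance, with the default publish actions one would expect behavior roughly
like the following (a sketch; error wording as in the patch, table names are
illustrative):

```sql
CREATE TABLE t (pk int PRIMARY KEY, extra int);
CREATE PUBLICATION p FOR TABLE t WHERE (extra > 0);

-- "extra" is not part of the replica identity (the primary key here),
-- so an UPDATE on the published table should now fail with something like:
--   ERROR:  cannot update table "t"
--   DETAIL:  Column "extra" used in the publication WHERE expression is
--            not part of the replica identity.
UPDATE t SET extra = 1;
```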
---
src/backend/catalog/pg_publication.c | 146 ++++++++++++++++-
src/backend/executor/execReplication.c | 36 +++-
src/backend/parser/parse_agg.c | 10 --
src/backend/parser/parse_expr.c | 21 +--
src/backend/parser/parse_func.c | 3 -
src/backend/parser/parse_oper.c | 7 -
src/backend/parser/parse_relation.c | 9 -
src/backend/utils/cache/relcache.c | 262 ++++++++++++++++++++++++++----
src/include/parser/parse_node.h | 1 -
src/include/utils/rel.h | 7 +
src/include/utils/relcache.h | 1 +
src/test/regress/expected/publication.out | 231 +++++++++++++++++++++-----
src/test/regress/sql/publication.sql | 178 +++++++++++++++++---
src/test/subscription/t/027_row_filter.pl | 7 +-
src/tools/pgindent/typedefs.list | 1 +
15 files changed, 760 insertions(+), 160 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 0929aa0..971110e 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,9 +33,11 @@
#include "catalog/pg_publication_namespace.h"
#include "catalog/pg_publication_rel.h"
#include "catalog/pg_type.h"
+#include "catalog/pg_proc.h"
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parse_clause.h"
#include "parser/parse_collate.h"
#include "parser/parse_relation.h"
@@ -112,6 +114,137 @@ check_publication_add_schema(Oid schemaid)
}
/*
+ * Is this a simple Node permitted within a row filter expression?
+ */
+static bool
+IsRowFilterSimpleNode(Node *node)
+{
+ switch (nodeTag(node))
+ {
+ case T_ArrayExpr:
+ case T_BooleanTest:
+ case T_BoolExpr:
+ case T_CaseExpr:
+ case T_CaseTestExpr:
+ case T_CoalesceExpr:
+ case T_Const:
+ case T_List:
+ case T_MinMaxExpr:
+ case T_NullIfExpr:
+ case T_NullTest:
+ case T_ScalarArrayOpExpr:
+ case T_XmlExpr:
+ return true;
+ default:
+ return false;
+ }
+}
+
+/*
+ * The row filter walker checks if the row filter expression is a "simple
+ * expression".
+ *
+ * It allows only simple or compound expressions such as:
+ * - "(Var Op Const)" or
+ * - "(Var Op Var)" or
+ * - "(Var Op Const) Bool (Var Op Const)"
+ * - etc
+ * (where Var is a column of the table this filter belongs to)
+ *
+ * Specifically,
+ * - User-defined operators are not allowed.
+ * - User-defined functions are not allowed.
+ * - User-defined types are not allowed.
+ * - Non-immutable builtin functions are not allowed.
+ * - System columns are not allowed.
+ *
+ * Notes:
+ *
+ * We don't allow user-defined functions/operators/types because (a) if the user
+ * drops such a user-definition or if there is any other error via its function,
+ * the walsender won't be able to recover from such an error even if we fix the
+ * function's problem because a historic snapshot is used to access the
+ * row-filter; (b) any other table could be accessed via a function, which won't
+ * work because of historic snapshots in logical decoding environment.
+ *
+ * We don't allow anything other than immutable built-in functions because
+ * non-immutable functions can access the database and would lead to the problem
+ * (b) mentioned in the previous paragraph.
+ */
+static bool
+rowfilter_walker(Node *node, Relation relation)
+{
+ char *errdetail_msg = NULL;
+
+ if (node == NULL)
+ return false;
+
+ if (IsRowFilterSimpleNode(node))
+ {
+ /* OK, node is part of simple expressions */
+ }
+ else if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+
+ /* User-defined types not allowed. */
+ if (var->vartype >= FirstNormalObjectId)
+ errdetail_msg = _("User-defined types are not allowed.");
+
+ /* System columns not allowed. */
+ else if (var->varattno < InvalidAttrNumber)
+ {
+ Oid relid = RelationGetRelid(relation);
+ const char *colname = get_attname(relid, var->varattno, false);
+
+ errdetail_msg = psprintf(_("Cannot use system column (%s)."), colname);
+ }
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *) node)->opno >= FirstNormalObjectId)
+ errdetail_msg = _("User-defined operators are not allowed.");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ Oid funcid = ((FuncExpr *) node)->funcid;
+ const char *funcname = get_func_name(funcid);
+
+ /*
+ * User-defined functions are not allowed.
+ * System functions that are not IMMUTABLE are not allowed.
+ */
+ if (funcid >= FirstNormalObjectId)
+ errdetail_msg = psprintf(_("User-defined functions are not allowed (%s)."),
+ funcname);
+ else if (func_volatile(funcid) != PROVOLATILE_IMMUTABLE)
+ errdetail_msg = psprintf(_("Non-immutable built-in functions are not allowed (%s)."),
+ funcname);
+ }
+ else
+ {
+ elog(DEBUG1, "the row filter contained something unexpected: %s", nodeToString(node));
+
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(relation)),
+ errdetail("Expressions only allow columns, constants and some built-in functions and operators.")
+ ));
+ }
+
+ if (errdetail_msg)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(relation)),
+ errdetail("%s", errdetail_msg)
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)relation);
+}
+
+/*
* Returns if relation represented by oid and Form_pg_class entry
* is publishable.
*
@@ -241,10 +374,6 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
-/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
- */
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -298,7 +427,7 @@ GetTransformedWhereClause(ParseState *pstate, PublicationRelInfo *pri,
addNSItemToQuery(pstate, nsitem, false, true, true);
whereclause = transformWhereClause(pstate, copyObject(pri->whereClause),
- EXPR_KIND_PUBLICATION_WHERE,
+ EXPR_KIND_WHERE,
"PUBLICATION WHERE");
/* Fix up collation information */
@@ -362,6 +491,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
* collation information.
*/
whereclause = GetTransformedWhereClause(pstate, pri, true);
+
+ /*
+ * Walk the parse-tree of this publication row filter expression and
+ * throw an error if anything not permitted or unexpected is
+ * encountered.
+ */
+ rowfilter_walker(whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 574d7d2..c175954 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -568,14 +568,46 @@ void
CheckCmdReplicaIdentity(Relation rel, CmdType cmd)
{
PublicationActions *pubactions;
+ AttrNumber bad_rfcolnum;
/* We only need to do checks for UPDATE and DELETE. */
if (cmd != CMD_UPDATE && cmd != CMD_DELETE)
return;
+ if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ return;
+
+ bad_rfcolnum = RelationGetInvalidRowFilterCol(rel);
+
+ /*
+ * It is only safe to execute UPDATE/DELETE when all columns referenced in
+ * the row filters of the publications that the relation is in are valid,
+ * which means all referenced columns are part of the REPLICA IDENTITY, or
+ * the table does not publish UPDATEs or DELETEs.
+ */
+ if (AttributeNumberIsValid(bad_rfcolnum))
+ {
+ const char *colname = get_attname(RelationGetRelid(rel),
+ bad_rfcolnum, false);
+
+ if (cmd == CMD_UPDATE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot update table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Column \"%s\" used in the publication WHERE expression is not part of the replica identity.",
+ colname)));
+ else if (cmd == CMD_DELETE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot delete from table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Column \"%s\" used in the publication WHERE expression is not part of the replica identity.",
+ colname)));
+ }
+
/* If relation has replica identity we are always good. */
- if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
- OidIsValid(RelationGetReplicaIndex(rel)))
+ if (OidIsValid(RelationGetReplicaIndex(rel)))
return;
/*
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..7d829a0 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,13 +551,6 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
- case EXPR_KIND_PUBLICATION_WHERE:
- if (isAgg)
- err = _("aggregate functions are not allowed in publication WHERE expressions");
- else
- err = _("grouping operations are not allowed in publication WHERE expressions");
-
- break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -950,9 +943,6 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
- case EXPR_KIND_PUBLICATION_WHERE:
- err = _("window functions are not allowed in publication WHERE expressions");
- break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..2d1a477 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,19 +200,8 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- {
- /*
- * Forbid functions in publication WHERE condition
- */
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("functions are not allowed in publication WHERE expressions"),
- parser_errposition(pstate, exprLocation(expr))));
-
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
- }
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -515,7 +504,6 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
- case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1776,9 +1764,6 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
- case EXPR_KIND_PUBLICATION_WHERE:
- err = _("cannot use subquery in publication WHERE expression");
- break;
/*
* There is intentionally no default: case here, so that the
@@ -3099,8 +3084,6 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
- case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 29bebb7..542f916 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,9 +2655,6 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
- case EXPR_KIND_PUBLICATION_WHERE:
- err = _("set-returning functions are not allowed in publication WHERE expressions");
- break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..bc34a23 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,13 +718,6 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
- /* Check it's not a custom operator for publication WHERE expressions */
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("user-defined operators are not allowed in publication WHERE expressions"),
- parser_errposition(pstate, location)));
-
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 036d9c6..c5c3f26 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -3538,20 +3538,11 @@ errorMissingRTE(ParseState *pstate, RangeVar *relation)
rte->eref->aliasname)),
parser_errposition(pstate, relation->location)));
else
- {
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
- ereport(ERROR,
- (errcode(ERRCODE_UNDEFINED_TABLE),
- errmsg("publication WHERE expression invalid reference to table \"%s\"",
- relation->relname),
- parser_errposition(pstate, relation->location)));
-
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_TABLE),
errmsg("missing FROM-clause entry for table \"%s\"",
relation->relname),
parser_errposition(pstate, relation->location)));
- }
}
/*
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 105d8d4..ed04881 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -56,6 +56,7 @@
#include "catalog/pg_opclass.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "catalog/pg_rewrite.h"
#include "catalog/pg_shseclabel.h"
#include "catalog/pg_statistic_ext.h"
@@ -71,6 +72,8 @@
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/optimizer.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_relation.h"
#include "rewrite/rewriteDefine.h"
#include "rewrite/rowsecurity.h"
#include "storage/lmgr.h"
@@ -84,6 +87,7 @@
#include "utils/memutils.h"
#include "utils/relmapper.h"
#include "utils/resowner_private.h"
+#include "utils/ruleutils.h"
#include "utils/snapmgr.h"
#include "utils/syscache.h"
@@ -5521,57 +5525,169 @@ RelationGetExclusionInfo(Relation indexRelation,
MemoryContextSwitchTo(oldcxt);
}
+/* For invalid_rowfilter_column_walker. */
+typedef struct {
+ AttrNumber invalid_rfcolnum; /* invalid column number */
+ Bitmapset *bms_replident; /* bitset of replica identity col indexes */
+ bool pubviaroot; /* true if we are validating the parent
+ * relation's row filter */
+ Oid relid; /* relid of the relation */
+ Oid parentid; /* relid of the parent relation */
+} rf_context;
+
/*
- * Get publication actions for the given relation.
+ * Check if any columns used in the row-filter WHERE clause are not part of
+ * REPLICA IDENTITY and save the invalid column number in
+ * rf_context::invalid_rfcolnum.
*/
-struct PublicationActions *
-GetRelationPublicationActions(Relation relation)
+static bool
+invalid_rowfilter_column_walker(Node *node, rf_context *context)
{
- List *puboids;
- ListCell *lc;
- MemoryContext oldcxt;
- Oid schemaid;
- PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+ AttrNumber attnum = var->varattno;
+
+ /*
+ * If pubviaroot is true, we need to convert the column number of
+ * parent to the column number of child relation first.
+ */
+ if (context->pubviaroot)
+ {
+ char *colname = get_attname(context->parentid, attnum, false);
+ attnum = get_attnum(context->relid, colname);
+ }
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber,
+ context->bms_replident))
+ {
+ context->invalid_rfcolnum = attnum;
+ return true;
+ }
+ }
+
+ return expression_tree_walker(node, invalid_rowfilter_column_walker,
+ (void *) context);
+}
+
+/*
+ * Append to cur_puboids each member of add_puboids that isn't already in
+ * cur_puboids.
+ *
+ * Also update the topmost parent relation's relid in the publication.
+ */
+static void
+concat_publication_oid(Oid relid,
+ List **cur_puboids,
+ List **toprelid_in_pub,
+ const List *add_puboids)
+{
+ ListCell *lc1,
+ *lc2,
+ *lc3;
+
+ foreach(lc1, add_puboids)
+ {
+ bool is_member = false;
+
+ forboth(lc2, *cur_puboids, lc3, *toprelid_in_pub)
+ {
+ if (lfirst_oid(lc2) == lfirst_oid(lc1))
+ {
+ is_member = true;
+ lfirst_oid(lc3) = relid;
+ }
+ }
+
+ if (!is_member)
+ {
+ *cur_puboids = lappend_oid(*cur_puboids, lfirst_oid(lc1));
+ *toprelid_in_pub = lappend_oid(*toprelid_in_pub, relid);
+ }
+ }
+}
+
+/*
+ * Get the invalid row filter column number for the given relation.
+ *
+ * Traverse all the publications which the relation is in to get the
+ * publication actions. If the publication actions include UPDATE or DELETE,
+ * then validate that all columns referenced in the row filter expression
+ * are part of REPLICA IDENTITY.
+ *
+ * If not all the row filter columns are part of REPLICA IDENTITY, return the
+ * invalid column number, otherwise InvalidAttrNumber.
+ */
+AttrNumber
+RelationGetInvalidRowFilterCol(Relation relation)
+{
+ List *puboids,
+ *toprelid_in_pub;
+ ListCell *lc;
+ MemoryContext oldcxt;
+ Oid schemaid;
+ Oid relid = RelationGetRelid(relation);
+ rf_context context = { 0 };
+ PublicationActions pubactions = { 0 };
+ bool rfcol_valid = true;
+ AttrNumber invalid_rfcolnum = InvalidAttrNumber;
/*
* If not publishable, it publishes no actions. (pgoutput_change() will
* ignore it.)
*/
- if (!is_publishable_relation(relation))
- return pubactions;
-
- if (relation->rd_pubactions)
- return memcpy(pubactions, relation->rd_pubactions,
- sizeof(PublicationActions));
+ if (!is_publishable_relation(relation) || relation->rd_rfcol_valid)
+ return invalid_rfcolnum;
/* Fetch the publication membership info. */
- puboids = GetRelationPublications(RelationGetRelid(relation));
+ toprelid_in_pub = puboids = NIL;
+ concat_publication_oid(relid, &puboids, &toprelid_in_pub,
+ GetRelationPublications(relid));
schemaid = RelationGetNamespace(relation);
- puboids = list_concat_unique_oid(puboids, GetSchemaPublications(schemaid));
+ concat_publication_oid(relid, &puboids, &toprelid_in_pub,
+ GetSchemaPublications(schemaid));
if (relation->rd_rel->relispartition)
{
/* Add publications that the ancestors are in too. */
- List *ancestors = get_partition_ancestors(RelationGetRelid(relation));
+ List *ancestors = get_partition_ancestors(relid);
ListCell *lc;
foreach(lc, ancestors)
{
Oid ancestor = lfirst_oid(lc);
- puboids = list_concat_unique_oid(puboids,
- GetRelationPublications(ancestor));
+ concat_publication_oid(ancestor, &puboids, &toprelid_in_pub,
+ GetRelationPublications(ancestor));
schemaid = get_rel_namespace(ancestor);
- puboids = list_concat_unique_oid(puboids,
- GetSchemaPublications(schemaid));
+ concat_publication_oid(ancestor, &puboids, &toprelid_in_pub,
+ GetSchemaPublications(schemaid));
}
+
+ relid = llast_oid(ancestors);
}
- puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
+ concat_publication_oid(relid, &puboids, &toprelid_in_pub,
+ GetAllTablesPublications());
+
+ /*
+ * Find which columns are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (relation->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT)
+ context.bms_replident = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else if (relation->rd_rel->relreplident == REPLICA_IDENTITY_INDEX)
+ context.bms_replident = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_IDENTITY_KEY);
foreach(lc, puboids)
{
Oid pubid = lfirst_oid(lc);
HeapTuple tup;
+
Form_pg_publication pubform;
tup = SearchSysCache1(PUBLICATIONOID, ObjectIdGetDatum(pubid));
@@ -5581,35 +5697,116 @@ GetRelationPublicationActions(Relation relation)
pubform = (Form_pg_publication) GETSTRUCT(tup);
- pubactions->pubinsert |= pubform->pubinsert;
- pubactions->pubupdate |= pubform->pubupdate;
- pubactions->pubdelete |= pubform->pubdelete;
- pubactions->pubtruncate |= pubform->pubtruncate;
+ pubactions.pubinsert |= pubform->pubinsert;
+ pubactions.pubupdate |= pubform->pubupdate;
+ pubactions.pubdelete |= pubform->pubdelete;
+ pubactions.pubtruncate |= pubform->pubtruncate;
ReleaseSysCache(tup);
/*
- * If we know everything is replicated, there is no point to check for
- * other publications.
+ * If the publication actions include UPDATE or DELETE, validate
+ * that any columns referenced in the filter expression are part of
+ * the REPLICA IDENTITY index.
+ *
+ * FULL means all cols are in the REPLICA IDENTITY, so all cols are
+ * allowed in the row-filter and we can skip the validation.
+ *
+ * If we already found a row filter column that is not part of the
+ * REPLICA IDENTITY index, skip the validation too.
*/
- if (pubactions->pubinsert && pubactions->pubupdate &&
- pubactions->pubdelete && pubactions->pubtruncate)
+ if ((pubform->pubupdate || pubform->pubdelete) &&
+ relation->rd_rel->relreplident != REPLICA_IDENTITY_FULL &&
+ rfcol_valid)
+ {
+ HeapTuple rftuple;
+
+ if (pubform->pubviaroot)
+ relid = list_nth_oid(toprelid_in_pub,
+ foreach_current_index(lc));
+ else
+ relid = RelationGetRelid(relation);
+
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP,
+ ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pubid));
+
+ if (HeapTupleIsValid(rftuple))
+ {
+ Datum rfdatum;
+ bool rfisnull;
+ Node *rfnode;
+
+ context.pubviaroot = pubform->pubviaroot;
+ context.parentid = relid;
+ context.relid = RelationGetRelid(relation);
+
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
+ Anum_pg_publication_rel_prqual,
+ &rfisnull);
+
+ if (!rfisnull)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfcol_valid = !invalid_rowfilter_column_walker(rfnode,
+ &context);
+ invalid_rfcolnum = context.invalid_rfcolnum;
+ pfree(rfnode);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+ }
+
+ /*
+ * If we know everything is replicated and some row filter column is
+ * not part of the replica identity, there is no point in checking
+ * other publications.
+ */
+ if (pubactions.pubinsert && pubactions.pubupdate &&
+ pubactions.pubdelete && pubactions.pubtruncate &&
+ !rfcol_valid)
break;
}
+ bms_free(context.bms_replident);
+
if (relation->rd_pubactions)
{
pfree(relation->rd_pubactions);
relation->rd_pubactions = NULL;
}
+ relation->rd_rfcol_valid = rfcol_valid;
+
/* Now save copy of the actions in the relcache entry. */
oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
relation->rd_pubactions = palloc(sizeof(PublicationActions));
- memcpy(relation->rd_pubactions, pubactions, sizeof(PublicationActions));
+ memcpy(relation->rd_pubactions, &pubactions, sizeof(PublicationActions));
MemoryContextSwitchTo(oldcxt);
- return pubactions;
+ return invalid_rfcolnum;
+}
+
+/*
+ * Get publication actions for the given relation.
+ */
+struct PublicationActions *
+GetRelationPublicationActions(Relation relation)
+{
+ PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
+
+ /*
+ * If not publishable, it publishes no actions. (pgoutput_change() will
+ * ignore it.)
+ */
+ if (!is_publishable_relation(relation))
+ return pubactions;
+
+ if (!relation->rd_pubactions)
+ (void) RelationGetInvalidRowFilterCol(relation);
+
+ return memcpy(pubactions, relation->rd_pubactions,
+ sizeof(PublicationActions));
}
/*
@@ -6163,6 +6360,7 @@ load_relcache_init_file(bool shared)
rel->rd_idattr = NULL;
rel->rd_hotblockingattr = NULL;
rel->rd_pubactions = NULL;
+ rel->rd_rfcol_valid = false;
rel->rd_statvalid = false;
rel->rd_statlist = NIL;
rel->rd_fkeyvalid = false;
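The relcache validation above boils down to a simple rule: a row filter is acceptable for UPDATE/DELETE publication only when every column it references belongs to the replica identity; INSERT-only publications skip the check. A minimal Python sketch of that rule (the names `filter_cols`, `replident_cols`, and `publishes_update_or_delete` are illustrative, not the patch's actual identifiers):

```python
def invalid_rowfilter_column(filter_cols, replident_cols, publishes_update_or_delete):
    """Return the first filter column missing from the replica identity,
    or None when the filter is valid (mirrors the walker's intent)."""
    if not publishes_update_or_delete:
        # INSERT-only publications need no replica-identity check.
        return None
    for col in sorted(filter_cols):
        if col not in replident_cols:
            return col
    return None

# Table with PRIMARY KEY (a, b): a filter on "c" is invalid for update/delete.
assert invalid_rowfilter_column({"c"}, {"a", "b"}, True) == "c"
# The same filter is fine if the publication only publishes inserts.
assert invalid_rowfilter_column({"c"}, {"a", "b"}, False) is None
```

This corresponds to the cached `rd_rfcol_valid` flag: once any publication yields an invalid column, the relation's updates and deletes are rejected with the "not part of the replica identity" error seen in the regression tests below.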
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index d58ae6a..ee17908 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,7 +80,6 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
- EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 3128127..27cec81 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -164,6 +164,13 @@ typedef struct RelationData
PublicationActions *rd_pubactions; /* publication actions */
/*
+ * True if the columns referenced by the row filters of all publications
+ * containing this relation are part of the replica identity, or if the
+ * publication actions include neither UPDATE nor DELETE.
+ */
+ bool rd_rfcol_valid;
+
+ /*
* rd_options is set whenever rd_rel is loaded into the relcache entry.
* Note that you can NOT look into rd_rel for this data. NULL means "use
* defaults".
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 82316bb..25c759f 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -76,6 +76,7 @@ extern void RelationInitIndexAccessInfo(Relation relation);
/* caller must include pg_publication.h */
struct PublicationActions;
extern struct PublicationActions *GetRelationPublicationActions(Relation relation);
+extern AttrNumber RelationGetInvalidRowFilterCol(Relation relation);
extern void RelationInitTableAccessMethod(Relation relation);
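The relcache code accumulates publication actions by OR-ing the flags from each publication the relation belongs to, and it can stop scanning early once every action is published (and, per the earlier comment, the row filter has already been found invalid). A rough Python sketch of that accumulation loop, with hypothetical dict-based publications standing in for `pg_publication` rows:

```python
def merge_pubactions(publications):
    """OR per-publication actions together, stopping once all are set
    (a simplified analogue of the early break in the relcache loop)."""
    actions = {"insert": False, "update": False,
               "delete": False, "truncate": False}
    for pub in publications:
        for action in actions:
            actions[action] = actions[action] or pub.get(action, False)
        if all(actions.values()):
            break  # nothing further can change the result
    return actions

merged = merge_pubactions([{"insert": True},
                           {"update": True, "delete": True, "truncate": True}])
assert all(merged.values())
```

This is why the regression tests with `WITH (publish = 'insert')` show `f` for Updates/Deletes/Truncates in the `\dRp+` output: only the flags a publication actually declares are OR-ed in.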
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 5a49003..d5bb70b 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -243,18 +243,21 @@ CREATE TABLE testpub_rf_tbl1 (a integer, b text);
CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
-CREATE SCHEMA testpub_rf_myschema;
-CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
-CREATE SCHEMA testpub_rf_myschema1;
-CREATE TABLE testpub_rf_myschema1.testpub_rf_tbl6(i integer);
+CREATE TABLE testpub_rf_tbl5 (a xml);
+CREATE SCHEMA testpub_rf_schema1;
+CREATE TABLE testpub_rf_schema1.testpub_rf_tbl5 (h integer);
+CREATE SCHEMA testpub_rf_schema2;
+CREATE TABLE testpub_rf_schema2.testpub_rf_tbl6 (i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish='insert' because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
@@ -264,7 +267,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
@@ -275,7 +278,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE ((e > 1000) AND (e < 2000))
@@ -286,7 +289,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
@@ -308,43 +311,43 @@ Publications:
DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax1
Publication testpub_syntax1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (e < 999)
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_schema1.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax2
Publication testpub_syntax2
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
- "testpub_rf_myschema.testpub_rf_tbl5" WHERE (h < 999)
+ "testpub_rf_schema1.testpub_rf_tbl5" WHERE (h < 999)
DROP PUBLICATION testpub_syntax2;
--- fail - schemas are not allowed WHERE row-filter
+-- fail - schemas don't allow WHERE clause
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1 WHERE (a = 123);
ERROR: syntax error at or near "WHERE"
-LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a =...
+LINE 1: ...ntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1 WHERE (a =...
^
-CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1, testpub_rf_schema1 WHERE (a = 123);
ERROR: WHERE clause for schema not allowed
-LINE 1: ...ax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf...
+LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1, testpub_rf...
^
RESET client_min_messages;
--- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+-- fail - duplicate tables are not allowed if that table has any WHERE clause
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
@@ -353,43 +356,185 @@ ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
RESET client_min_messages;
-- fail - aggregate functions not allowed in WHERE clause
ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e < AVG(e));
-ERROR: functions are not allowed in publication WHERE expressions
+ERROR: aggregate functions are not allowed in WHERE
LINE 1: ...TION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e < AVG(e));
^
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
--- fail - user-defined operators disallowed
-CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
-CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+-- fail - user-defined operators are not allowed
+CREATE FUNCTION testpub_rf_func1(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func1, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: User-defined operators are not allowed.
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func2() RETURNS integer AS $$ BEGIN RETURN 123; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a >= testpub_rf_func2());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: User-defined functions are not allowed (testpub_rf_func2).
+-- fail - non-immutable functions are not allowed. random() is volatile.
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a < random());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: Non-immutable built-in functions are not allowed (random).
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
+-- ok - builtin operators are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE ((a > 5) IS FALSE);
+-- ok - immutable builtin functions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined types disallowed
+CREATE TYPE rf_bug_status AS ENUM ('new', 'open', 'closed');
+CREATE TABLE rf_bug (id serial, description text, status rf_bug_status);
+CREATE PUBLICATION testpub6 FOR TABLE rf_bug WHERE (status = 'open') WITH (publish = 'insert');
+ERROR: invalid publication WHERE expression for relation "rf_bug"
+DETAIL: User-defined types are not allowed
+DROP TABLE rf_bug;
+DROP TYPE rf_bug_status;
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: Expressions only allow columns, constants and some built-in functions and operators.
+-- fail - system columns are not allowed
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl1 WHERE ('(0,1)'::tid = ctid);
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: Cannot use system column (ctid).
+-- ok - conditional expressions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (a IS DOCUMENT);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (xmlexists('//foo[text() = ''bar'']' PASSING BY VALUE a));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1, 2) = a);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (CASE a WHEN 5 THEN true ELSE false END);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (COALESCE(b, 'foo') = 'foo');
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (GREATEST(a, 10) > 10);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IN (2, 4, 6));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (ARRAY[a] <@ ARRAY[2, 4, 6]);
-- fail - WHERE not allowed in DROP
-ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl1 WHERE (e < 27);
ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
-ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tbl6 WHERE (i < 99);
-ERROR: cannot add relation "testpub_rf_myschema1.testpub_rf_tbl6" to publication
-DETAIL: Table's schema "testpub_rf_myschema1" is already part of the publication or part of the specified schema list.
+CREATE PUBLICATION testpub6 FOR ALL TABLES IN SCHEMA testpub_rf_schema2;
+ALTER PUBLICATION testpub6 SET ALL TABLES IN SCHEMA testpub_rf_schema2, TABLE testpub_rf_schema2.testpub_rf_tbl6 WHERE (i < 99);
+ERROR: cannot add relation "testpub_rf_schema2.testpub_rf_tbl6" to publication
+DETAIL: Table's schema "testpub_rf_schema2" is already part of the publication or part of the specified schema list.
RESET client_min_messages;
DROP TABLE testpub_rf_tbl1;
DROP TABLE testpub_rf_tbl2;
DROP TABLE testpub_rf_tbl3;
DROP TABLE testpub_rf_tbl4;
-DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
-DROP TABLE testpub_rf_myschema1.testpub_rf_tbl6;
-DROP SCHEMA testpub_rf_myschema;
-DROP SCHEMA testpub_rf_myschema1;
+DROP TABLE testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema1.testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema2.testpub_rf_tbl6;
+DROP SCHEMA testpub_rf_schema1;
+DROP SCHEMA testpub_rf_schema2;
DROP PUBLICATION testpub5;
-DROP PUBLICATION testpub7;
+DROP PUBLICATION testpub6;
DROP OPERATOR =#>(integer, integer);
-DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func1(integer, integer);
+DROP FUNCTION testpub_rf_func2();
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+CREATE TABLE rf_tbl_abcd_part_pk (a int PRIMARY KEY, b int) PARTITION by RANGE (a);
+CREATE TABLE rf_tbl_abcd_part_pk_1 (b int, a int PRIMARY KEY);
+ALTER TABLE rf_tbl_abcd_part_pk ATTACH PARTITION rf_tbl_abcd_part_pk_1 FOR VALUES FROM (1) TO (10);
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+-- ok - "a" is a PK col
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (b > 99);
+-- ok - "b" is a PK col
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "c" is not part of the PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Column "c" used in the publication WHERE expression is not part of the replica identity.
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- fail - "d" is not part of the PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Column "d" used in the publication WHERE expression is not part of the replica identity.
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not part of REPLICA IDENTITY
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
+DETAIL: Column "a" used in the publication WHERE expression is not part of the replica identity.
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "a" is in REPLICA IDENTITY now
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Column "a" used in the publication WHERE expression is not part of the replica identity.
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Column "c" used in the publication WHERE expression is not part of the replica identity.
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
+DETAIL: Column "a" used in the publication WHERE expression is not part of the replica identity.
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Column "a" used in the publication WHERE expression is not part of the replica identity.
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
+DETAIL: Column "a" used in the publication WHERE expression is not part of the replica identity.
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+-- Tests for partitioned table
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (a > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- ok - "a" is a PK col
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (b > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- fail - "b" is not in REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_part_pk_1"
+DETAIL: Column "b" used in the publication WHERE expression is not part of the replica identity.
+DROP PUBLICATION testpub6;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+DROP TABLE rf_tbl_abcd_part_pk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 47bdba8..a95c71b 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -138,12 +138,15 @@ CREATE TABLE testpub_rf_tbl1 (a integer, b text);
CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
-CREATE SCHEMA testpub_rf_myschema;
-CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
-CREATE SCHEMA testpub_rf_myschema1;
-CREATE TABLE testpub_rf_myschema1.testpub_rf_tbl6(i integer);
+CREATE TABLE testpub_rf_tbl5 (a xml);
+CREATE SCHEMA testpub_rf_schema1;
+CREATE TABLE testpub_rf_schema1.testpub_rf_tbl5 (h integer);
+CREATE SCHEMA testpub_rf_schema2;
+CREATE TABLE testpub_rf_schema2.testpub_rf_tbl6 (i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish='insert' because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -162,53 +165,178 @@ RESET client_min_messages;
DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax1
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_schema1.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax2
DROP PUBLICATION testpub_syntax2;
--- fail - schemas are not allowed WHERE row-filter
+-- fail - schemas don't allow WHERE clause
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
-CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1 WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1, testpub_rf_schema1 WHERE (a = 123);
RESET client_min_messages;
--- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+-- fail - duplicate tables are not allowed if that table has any WHERE clause
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
RESET client_min_messages;
-- fail - aggregate functions not allowed in WHERE clause
ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e < AVG(e));
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
--- fail - user-defined operators disallowed
-CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
-CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+-- fail - user-defined operators are not allowed
+CREATE FUNCTION testpub_rf_func1(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func1, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func2() RETURNS integer AS $$ BEGIN RETURN 123; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a >= testpub_rf_func2());
+-- fail - non-immutable functions are not allowed. random() is volatile.
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a < random());
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
+-- ok - builtin operators are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE ((a > 5) IS FALSE);
+-- ok - immutable builtin functions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined types disallowed
+CREATE TYPE rf_bug_status AS ENUM ('new', 'open', 'closed');
+CREATE TABLE rf_bug (id serial, description text, status rf_bug_status);
+CREATE PUBLICATION testpub6 FOR TABLE rf_bug WHERE (status = 'open') WITH (publish = 'insert');
+DROP TABLE rf_bug;
+DROP TYPE rf_bug_status;
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+-- fail - system columns are not allowed
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl1 WHERE ('(0,1)'::tid = ctid);
+-- ok - conditional expressions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (a IS DOCUMENT);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (xmlexists('//foo[text() = ''bar'']' PASSING BY VALUE a));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1, 2) = a);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (CASE a WHEN 5 THEN true ELSE false END);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (COALESCE(b, 'foo') = 'foo');
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (GREATEST(a, 10) > 10);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IN (2, 4, 6));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (ARRAY[a] <@ ARRAY[2, 4, 6]);
-- fail - WHERE not allowed in DROP
-ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl1 WHERE (e < 27);
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
-ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tbl6 WHERE (i < 99);
+CREATE PUBLICATION testpub6 FOR ALL TABLES IN SCHEMA testpub_rf_schema2;
+ALTER PUBLICATION testpub6 SET ALL TABLES IN SCHEMA testpub_rf_schema2, TABLE testpub_rf_schema2.testpub_rf_tbl6 WHERE (i < 99);
RESET client_min_messages;
DROP TABLE testpub_rf_tbl1;
DROP TABLE testpub_rf_tbl2;
DROP TABLE testpub_rf_tbl3;
DROP TABLE testpub_rf_tbl4;
-DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
-DROP TABLE testpub_rf_myschema1.testpub_rf_tbl6;
-DROP SCHEMA testpub_rf_myschema;
-DROP SCHEMA testpub_rf_myschema1;
+DROP TABLE testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema1.testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema2.testpub_rf_tbl6;
+DROP SCHEMA testpub_rf_schema1;
+DROP SCHEMA testpub_rf_schema2;
DROP PUBLICATION testpub5;
-DROP PUBLICATION testpub7;
+DROP PUBLICATION testpub6;
DROP OPERATOR =#>(integer, integer);
-DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func1(integer, integer);
+DROP FUNCTION testpub_rf_func2();
+
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+CREATE TABLE rf_tbl_abcd_part_pk (a int PRIMARY KEY, b int) PARTITION by RANGE (a);
+CREATE TABLE rf_tbl_abcd_part_pk_1 (b int, a int PRIMARY KEY);
+ALTER TABLE rf_tbl_abcd_part_pk ATTACH PARTITION rf_tbl_abcd_part_pk_1 FOR VALUES FROM (1) TO (10);
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+-- ok - "a" is a PK col
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (b > 99);
+-- ok - "b" is a PK col
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "c" is not part of the PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- fail - "d" is not part of the PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not part of REPLICA IDENTITY
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "a" is in REPLICA IDENTITY now
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+
+-- Tests for partitioned table
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (a > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- ok - "a" is a OK col
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (b > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- fail - "b" is not in REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+
+DROP PUBLICATION testpub6;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+DROP TABLE rf_tbl_abcd_part_pk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index 64e71d0..de6b73d 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -280,9 +282,7 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -291,7 +291,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 0c61ccb..89f3917 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3503,6 +3503,7 @@ replace_rte_variables_context
ret_type
rewind_source
rewrite_event
+rf_context
rijndael_ctx
rm_detail_t
role_auth_extra
--
1.8.3.1
On Fri, Dec 17, 2021 at 5:29 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Fri, Dec 17, 2021 at 7:20 PM Ajin Cherian <itsajin@gmail.com> wrote:
On Fri, Dec 17, 2021 at 5:46 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
So using the v47 patch-set, I still find that the UPDATE above results in publication of an INSERT of (2,1), rather than an UPDATE of (1,1) to (2,1).
This is according to the 2nd UPDATE rule below, from patch 0003:

+ * old-row (no match) new-row (no match) -> (drop change)
+ * old-row (no match) new-row (match)    -> INSERT
+ * old-row (match)    new-row (no match) -> DELETE
+ * old-row (match)    new-row (match)    -> UPDATE

This is because the old row (1,1) doesn't match the UPDATE filter "(a>1)", but the new row (2,1) does.
This functionality doesn't seem right to me. I don't think it can be assumed that (1,1) was never published (and thus requires an INSERT rather than UPDATE) based on these checks, because in this example, (1,1) was previously published via a different operation - INSERT (and using a different filter too).
I think the fundamental problem here is that these UPDATE rules assume that the old (current) row was previously UPDATEd (and published, or not published, according to the filter applicable to UPDATE), but this is not necessarily the case.
Or am I missing something?

But it need not be correct in assuming that the old-row was part of a
previous INSERT either (and published, or not published according to
the filter applicable to an INSERT).
For example, change the sequence of inserts and updates prior to the
last update:

truncate tbl1;
insert into tbl1 values (1,5); ==> not replicated since insert and ! (b < 2)
update tbl1 set b = 1;         ==> not replicated since update and ! (a > 1)
update tbl1 set a = 2;         ==> replicated and update converted to insert since (a > 1)

In this case, the last update "update tbl1 set a = 2;" is updating a
row that was previously updated and not inserted and not replicated to
the subscriber.
How does the replication logic differentiate between these two cases,
and decide if the update was previously published or not?
I think it's futile for the publisher side to try and figure out the
history of published rows. In fact, if this level of logic is required
then it is best implemented on the subscriber side, which then defeats
the purpose of a publication filter.

I think it's a concern, for such a basic example with only one row,
getting unpredictable (and even wrong) replication results, depending
upon the order of operations.
I am not sure how we can deduce that. The results are based on current
and new values of row which is what I think we are expecting here.
Doesn't this problem result from allowing different WHERE clauses for
different pubactions for the same table?
My current thoughts are that this shouldn't be allowed, and also WHERE
clauses for INSERTs should, like UPDATE and DELETE, be restricted to
using only columns covered by the replica identity or primary key.
Hmm, even if we do that one could have removed the insert row filter
by the time we are evaluating the update. So, we will get the same
result. I think the behavior in your example is as we expect as per
the specs defined by the patch and I don't see any problem, in this
case, w.r.t replication results. Let us see what others think on this?
--
With Regards,
Amit Kapila.
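The four UPDATE transformation rules being debated above can be captured in a small standalone model. This is illustrative Python only, not PostgreSQL code; the function and variable names are invented for the sketch:

```python
# Illustrative model of the v47 0003 UPDATE rules (not PostgreSQL source):
#   old-row no match, new-row no match -> drop change
#   old-row no match, new-row match    -> publish as INSERT
#   old-row match,    new-row no match -> publish as DELETE
#   old-row match,    new-row match    -> publish as UPDATE

def transform_update(row_filter, old_row, new_row):
    """Decide how an UPDATE is published given a row-filter predicate."""
    old_match = row_filter(old_row)
    new_match = row_filter(new_row)
    if not old_match and not new_match:
        return None                      # change is dropped entirely
    if not old_match and new_match:
        return ("INSERT", new_row)       # row "enters" the filtered set
    if old_match and not new_match:
        return ("DELETE", old_row)       # row "leaves" the filtered set
    return ("UPDATE", old_row, new_row)

# The scenario under discussion: filter (a > 1), old row (1,1), new row (2,1).
flt = lambda row: row["a"] > 1
print(transform_update(flt, {"a": 1, "b": 1}, {"a": 2, "b": 1}))
# -> ('INSERT', {'a': 2, 'b': 1})  -- the UPDATE is converted to an INSERT
```

Running this on the example reproduces the behavior Greg observed: the publisher sends an INSERT of (2,1) rather than an UPDATE, regardless of whether (1,1) ever reached the subscriber.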
On Sat, Dec 18, 2021 at 1:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I think currently there could be a problem with user perceptions. IMO
a user would be mostly interested in predictability and getting
results that are intuitive.
So, even if all strange results can (after careful examination) be
after-the-fact explained away as being "correct" according to a spec,
I don't think that is going to make any difference. e.g. regardless of
correctness, even if it just "appeared" to give unexpected results
then a user may just decide that row-filtering is not worth their
confusion...
Perhaps there is a slightly dumbed-down RF design that can still be
useful, but which can give much more comfort to the user because the
replica will be more like what they were expecting?
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Sat, Dec 18, 2021 at 1:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I think it's a concern, for such a basic example with only one row,
getting unpredictable (and even wrong) replication results, depending
upon the order of operations.

I am not sure how we can deduce that. The results are based on current
and new values of row which is what I think we are expecting here.
In the two simple cases presented, the publisher ends up with the same
single row (2,1) in both cases, but in one of the cases the subscriber
ends up with an extra row (1,1) that the publisher doesn't have. So,
in using a "filter", a new row has been published that the publisher
doesn't have. I'm not so sure a user would be expecting that. Not to
mention that if (1,1) is subsequently INSERTed on the publisher side,
replicating it will result in a duplicate key error on the subscriber.
Doesn't this problem result from allowing different WHERE clauses for
different pubactions for the same table?
My current thoughts are that this shouldn't be allowed, and also WHERE
clauses for INSERTs should, like UPDATE and DELETE, be restricted to
using only columns covered by the replica identity or primary key.

Hmm, even if we do that one could have removed the insert row filter
by the time we are evaluating the update. So, we will get the same
result. I think the behavior in your example is as we expect as per
the specs defined by the patch and I don't see any problem, in this
case, w.r.t replication results. Let us see what others think on this?
Here I'm talking about the typical use-case of setting the
row-filtering WHERE clause up-front and not changing it thereafter.
I think that dynamically changing filters after INSERT/UPDATE/DELETE
operations is not the typical use-case, and IMHO it's another thing
entirely (could result in all kinds of unpredictable, random results).
Personally I think it would make more sense to:
1) Disallow different WHERE clauses on the same table, for different pubactions.
2) If only INSERTs are being published, allow any column in the WHERE
clause, otherwise (as for UPDATE and DELETE) restrict the referenced
columns to be part of the replica identity or primary key.
Regards,
Greg Nancarrow
Fujitsu Australia
-----Original Message-----
From: Amit Kapila <amit.kapila16@gmail.com>
On Saturday, December 18, 2021 10:33 AM
I think it might not be hard to predict the current behavior. Users only need
to be aware that:
1) pubaction and row filter on different publications are combined with 'OR'.
2) For UPDATE, we execute the filter for both the OLD and NEW tuples and would
change the operation type accordingly.
For the example mentioned:
create table tbl1 (a int primary key, b int);
create publication A for table tbl1 where (b<2) with(publish='insert');
create publication B for table tbl1 where (a>1) with(publish='update');
If we follow the rule 1) and 2), I feel we are able to predict the following
conditions:
--
WHERE (action = 'insert' AND b < 2) OR (action = 'update' AND a > 1)
--
So, it seems acceptable to me.
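The two rules can be modeled with a small standalone sketch (illustrative Python, not PostgreSQL code; the function names are invented) that groups filters per pubaction and OR's them together:

```python
# Illustrative model (not PostgreSQL source): row filters from multiple
# publications are grouped per pubaction and combined with OR.

def combine_filters(filters):
    """filters: list of (pubaction, predicate). Returns action -> predicate."""
    by_action = {}
    for action, pred in filters:
        by_action.setdefault(action, []).append(pred)
    # A row passes for an action if ANY publication's filter accepts it.
    return {action: (lambda row, ps=preds: any(p(row) for p in ps))
            for action, preds in by_action.items()}

# The publications from the example:
#   CREATE PUBLICATION A FOR TABLE tbl1 WHERE (b < 2) WITH (publish = 'insert');
#   CREATE PUBLICATION B FOR TABLE tbl1 WHERE (a > 1) WITH (publish = 'update');
effective = combine_filters([
    ("insert", lambda row: row["b"] < 2),
    ("update", lambda row: row["a"] > 1),
])
print(effective["insert"]({"a": 0, "b": 1}))  # True: b < 2 holds
print(effective["update"]({"a": 0, "b": 1}))  # False: a > 1 fails
```

This mirrors the combined condition shown above: (action = 'insert' AND b < 2) OR (action = 'update' AND a > 1).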
Personally, I think the current design could give user more flexibility to
handle some complex scenario. If user want some simple setting for publication,
they can also set same row filter for the same table in different publications.
To avoid confusion, I think we can document about these rules clearly.
BTW, from IBM's documentation, I think IBM also supports this kind of complex
condition [1].
[1]: https://www.ibm.com/docs/en/idr/11.4.0?topic=rows-log-record-variables
Best regards,
Hou zj
On Fri, Dec 17, 2021 6:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Dec 17, 2021 at 4:11 AM Peter Smith <smithpb2250@gmail.com> wrote:
PSA the v47* patch set.
Few comments on v47-0002:
=======================
1. The handling to find rowfilter for ancestors in
RelationGetInvalidRowFilterCol seems complex. It seems you are
accumulating non-partition relations as well in toprelid_in_pub. Can
we simplify such that we find the ancestor only for 'pubviaroot'
publications?

2. I think the name RelationGetInvalidRowFilterCol is confusing
because the same function is also used to get publication actions. Can
we name it as GetRelationPublicationInfo() and pass a bool parameter
to indicate whether row_filter info needs to be built. We can get the
invalid_row_filter column as output from that function.

3.
+GetRelationPublicationActions(Relation relation)
+{
..
+	if (!relation->rd_pubactions)
+		(void) RelationGetInvalidRowFilterCol(relation);
+
+	return memcpy(pubactions, relation->rd_pubactions,
+				  sizeof(PublicationActions));
..
..
}

I think here we can reverse the check such that if actions are set
just do memcpy and return, otherwise get the relation publication
actions info.

4.
invalid_rowfilter_column_walker
{
..
	/*
	 * If pubviaroot is true, we need to convert the column number of
	 * parent to the column number of child relation first.
	 */
	if (context->pubviaroot)
	{
		char *colname = get_attname(context->parentid, attnum, false);
		attnum = get_attnum(context->relid, colname);
	}

Here, in the comments, you can tell why you need this conversion. Can
we name this function as rowfilter_column_walker()?

5.
+/* For invalid_rowfilter_column_walker. */
+typedef struct
+{
+	AttrNumber	invalid_rfcolnum;	/* invalid column number */
+	Bitmapset  *bms_replident;	/* bitset of replica identity col indexes */
+	bool		pubviaroot;		/* true if we are validating the parent
+								 * relation's row filter */
+	Oid			relid;			/* relid of the relation */
+	Oid			parentid;		/* relid of the parent relation */
+} rf_context;

Normally, we declare structs at the beginning of the file and for the
formatting of struct declarations, see other nearby structs like
RelIdCacheEnt.

6. Can we name IsRowFilterSimpleNode() as IsRowFilterSimpleExpr()?
Thanks for the comments; I agree with all of them.
Attached is the V49 patch set, which addresses all the above comments on the
0002 patch.
Best regards,
Hou zj
Attachments:
v49-0001-Row-filter-for-logical-replication.patchapplication/octet-stream; name=v49-0001-Row-filter-for-logical-replication.patchDownload
From 3ba15c0648d6d9504533bcfc872a560d2a7ddcdd Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 17 Dec 2021 18:59:42 +1100
Subject: [PATCH v49] Row-filter for logical replication.
This feature adds row filtering for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row-filter is per table. A new row-filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.
The row-filter WHERE clause for a table added to a publication that publishes
UPDATE and/or DELETE operations must contain only columns that are covered by
REPLICA IDENTITY. The row-filter WHERE clause for a table added to a publication
that publishes INSERT can use any column. If the row-filter evaluates to NULL,
it returns false. The WHERE clause allows simple expressions. Simple expressions
cannot contain any aggregate or window functions, non-immutable functions,
user-defined types, operators or functions. This restriction could possibly be
addressed in the future.
If you choose to do the initial table synchronization, only data that satisfies
the row-filters is pulled by the subscriber. If the subscription has several
publications in which a table has been published with different WHERE clauses,
rows which satisfy ANY of the expressions will be copied. If a subscriber is a
pre-15 version, the initial table synchronization won't use row-filters even
if they are defined in the publisher.
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row-filter (if
the parameter is false, the default) or the root partitioned table row-filter.
Psql commands \dRp+ and \d+ will display any row-filters.
Author: Euler Taveira, Peter Smith
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
Combining multiple row-filters
==============================
The subscription is treated "as a union of all the publications" [1], so the
row-filters are combined with OR.
If the subscription has several publications in which the same table has been
published with different filters, those expressions get OR'ed together so that
rows satisfying any of the expressions will be replicated.
Notice this means if one of the publications has no filter at all then all other
filters become redundant.
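The redundancy noted above follows directly from OR-combination, as this small sketch shows (illustrative Python, not PostgreSQL code; names are invented):

```python
# Illustrative sketch (not PostgreSQL source): when OR-combining row filters,
# a publication with no filter behaves as "always true", which makes every
# other filter for the same table redundant.

ALWAYS_TRUE = lambda row: True

def combined(filters):
    """OR together the per-publication filters for one table."""
    if any(f is ALWAYS_TRUE for f in filters):
        return ALWAYS_TRUE               # short-circuit: no filtering needed
    return lambda row: any(f(row) for f in filters)

f = combined([lambda row: row["a"] > 99, ALWAYS_TRUE])
print(f({"a": 1}))   # True: the unfiltered publication wins
```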
Author: Peter Smith
[1] https://www.postgresql.org/message-id/574b4e78-2f35-acf3-4bdc-4b872582e739%40enterprisedb.com
Row-filter caching
==================
The cached row-filters (e.g. ExprState *) are invalidated only in function
rel_sync_cache_relation_cb, so it means the ALTER PUBLICATION for one table
will not cause row-filters of other tables to also become invalidated.
The code related to caching row-filters is done just before they are needed
(in the pgoutput_row_filter function).
If there are multiple publication row-filters for a given table these are all
combined/flattened into a single filter.
Author: Peter Smith, Greg Nancarrow
The filter caching is based on suggestions from Amit [1] [2], and Houz [3]
[1] https://www.postgresql.org/message-id/CAA4eK1%2BxQb06NGs6Y7OzwMtKYYixEqR8tdWV5THAVE4SAqNrDg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2Btio46goUKBUfAKFsFVxtgk8nOty%3DTxKoKH-gdLzHD2g%40mail.gmail.com
[3] https://www.postgresql.org/message-id/OS0PR01MB5716090A70A73ADF58C58950948D9%40OS0PR01MB5716.jpnprd01.prod.outlook.com
Cache ExprState per pubaction.
If a subscriber has multiple publications and these publications include the
same table then there can be multiple filters that apply to that table.
These filters are stored per-pubactions of the publications. There are 4 kinds
of pubaction ("insert", "update", "delete", "truncate"), but row-filters are
not applied for "truncate".
Filters for the same pubaction are all combined (OR'ed) and cached as one, so
at the end there are at most 3 cached filters per table.
The appropriate (pubaction) filter is executed according to the DML operation.
Author: Peter Smith
Discussion: https://www.postgresql.org/message-id/CAA4eK1%2BhVXfOSScbf5LUB%3D5is%3DwYaC6NBhLxuvetbWQnZRnsVQ%40mail.gmail.com
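The lazy, per-pubaction caching described above can be sketched as a standalone model (illustrative Python, not pgoutput code; the class and names are invented):

```python
# Illustrative sketch: one combined filter cached per pubaction, built
# lazily on first use and dropped on relation-level invalidation.

class RelFilterCache:
    PUBACTIONS = ("insert", "update", "delete")   # no filter for truncate

    def __init__(self, build):
        self._build = build      # build(action) -> combined predicate
        self._cache = {}         # at most 3 entries, one per pubaction

    def get(self, action):
        assert action in self.PUBACTIONS
        if action not in self._cache:    # built on first use only
            self._cache[action] = self._build(action)
        return self._cache[action]

    def invalidate(self):
        self._cache.clear()      # e.g. after ALTER PUBLICATION on this table

builds = []
cache = RelFilterCache(lambda a: builds.append(a) or (lambda row: True))
cache.get("insert"); cache.get("insert")
print(builds)   # ['insert'] -- the second call hits the cache
```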
---
doc/src/sgml/catalogs.sgml | 8 +
doc/src/sgml/ref/alter_publication.sgml | 13 +-
doc/src/sgml/ref/create_publication.sgml | 37 ++-
doc/src/sgml/ref/create_subscription.sgml | 24 +-
src/backend/catalog/pg_publication.c | 69 ++++-
src/backend/commands/publicationcmds.c | 108 +++++++-
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/parser/gram.y | 38 ++-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 21 +-
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_oper.c | 7 +
src/backend/parser/parse_relation.c | 9 +
src/backend/replication/logical/tablesync.c | 118 +++++++-
src/backend/replication/pgoutput/pgoutput.c | 410 +++++++++++++++++++++++++++-
src/bin/psql/describe.c | 26 +-
src/include/catalog/pg_publication.h | 7 +-
src/include/catalog/pg_publication_rel.h | 6 +
src/include/nodes/parsenodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/test/regress/expected/publication.out | 151 ++++++++++
src/test/regress/sql/publication.sql | 76 ++++++
src/test/subscription/t/027_row_filter.pl | 357 ++++++++++++++++++++++++
24 files changed, 1451 insertions(+), 51 deletions(-)
create mode 100644 src/test/subscription/t/027_row_filter.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 03e2537..2f1f913 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6311,6 +6311,14 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
Reference to relation
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>prqual</structfield> <type>pg_node_tree</type>
+ </para>
+ <para>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index bb4ef5e..5d9869c 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -30,7 +30,7 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -52,7 +52,9 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
remove one or more tables/schemas from the publication. Note that adding
tables/schemas to a publication that is already subscribed to will require a
<literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal> action on the
- subscribing side in order to become effective.
+ subscribing side in order to become effective. Note also that the combination
+ of <literal>DROP</literal> with <literal>WHERE</literal> clause is not
+ allowed.
</para>
<para>
@@ -109,7 +111,12 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
table name, only that table is affected. If <literal>ONLY</literal> is not
specified, the table and all its descendant tables (if any) are
affected. Optionally, <literal>*</literal> can be specified after the table
- name to explicitly indicate that descendant tables are included.
+ name to explicitly indicate that descendant tables are included. If the
+ optional <literal>WHERE</literal> clause is specified, rows that do not
+ satisfy the <replaceable class="parameter">expression</replaceable> will
+ not be published. Note that parentheses are required around the
+ expression. The <replaceable class="parameter">expression</replaceable>
+ is evaluated with the role used for the replication connection.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index d805e8e..5aeee23 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -28,7 +28,7 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<phrase>where <replaceable class="parameter">publication_object</replaceable> is one of:</phrase>
- TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]
+ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ WHERE ( <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ... ]
</synopsis>
</refsynopsisdiv>
@@ -79,6 +79,13 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ If the optional <literal>WHERE</literal> clause is specified, only rows
+ that satisfy the <replaceable class="parameter">expression</replaceable>
+ will be published. Note that parentheses are required around the
+ expression. It has no effect on <literal>TRUNCATE</literal> commands.
+ </para>
+
+ <para>
Only persistent base tables and partitioned tables can be part of a
publication. Temporary tables, unlogged tables, foreign tables,
materialized views, and regular views cannot be part of a publication.
@@ -226,6 +233,22 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
</para>
<para>
+ A <literal>WHERE</literal> clause must contain only columns that are
+ part of the primary key or are covered by the <literal>REPLICA
+ IDENTITY</literal>, in order for <command>UPDATE</command> and
+ <command>DELETE</command> operations to be published.
+ For publication of <command>INSERT</command> operations, any column
+ may be used in the <literal>WHERE</literal> clause.
+ If nullable columns are present in the <literal>WHERE</literal> clause,
+ possible NULL values should be accounted for in expressions, to avoid
+ unexpected results, because <literal>NULL</literal> values can cause
+ those expressions to evaluate to false.
+ A <literal>WHERE</literal> clause allows simple expressions. The simple
+ expression cannot contain any aggregate or window functions, non-immutable
+ functions, user-defined types, operators or functions.
+ </para>
+
+ <para>
For an <command>INSERT ... ON CONFLICT</command> command, the publication will
publish the operation that actually results from the command. So depending
of the outcome, it may be published as either <command>INSERT</command> or
@@ -247,6 +270,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
<para>
<acronym>DDL</acronym> operations are not published.
</para>
+
+ <para>
+ The <literal>WHERE</literal> clause expression is executed with the role used
+ for the replication connection.
+ </para>
</refsect1>
<refsect1>
@@ -260,6 +288,13 @@ CREATE PUBLICATION mypublication FOR TABLE users, departments;
</para>
<para>
+ Create a publication that publishes all changes from active departments:
+<programlisting>
+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
+</programlisting>
+ </para>
+
+ <para>
Create a publication that publishes all changes in all tables:
<programlisting>
CREATE PUBLICATION alltables FOR ALL TABLES;
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f..db255f3 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -208,6 +208,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
that are being subscribed to when the replication starts.
The default is <literal>true</literal>.
</para>
+ <para>
+ If the publications contain conditional expressions, it will affect
+ what data is copied. Refer to the
+ <xref linkend="sql-createsubscription-notes" /> for details.
+ </para>
</listitem>
</varlistentry>
@@ -293,7 +298,7 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</variablelist>
</refsect1>
- <refsect1>
+ <refsect1 id="sql-createsubscription-notes" xreflabel="Notes">
<title>Notes</title>
<para>
@@ -319,6 +324,23 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
the parameter <literal>create_slot = false</literal>. This is an
implementation restriction that might be lifted in a future release.
</para>
+
+ <para>
+ If any table in the publication has a <literal>WHERE</literal> clause, rows
+ that do not satisfy the <replaceable class="parameter">expression</replaceable>
+ will not be published (i.e. they will be filtered out).
+ If the subscription has several publications in which the same table has been
+ published with different <literal>WHERE</literal> clauses, those expressions
+ (for the same publish operation) get OR'ed together so that rows satisfying any
+ of the expressions will be published. Also, if one of the publications for the
+ same table has no <literal>WHERE</literal> clause at all, or is a <literal>FOR
+ ALL TABLES</literal> or <literal>FOR ALL TABLES IN SCHEMA</literal> publication,
+ then all other <literal>WHERE</literal> clauses (for the same publish operation)
+ become redundant.
+ If the subscriber is a <productname>PostgreSQL</productname> version before 15
+ then any row filtering is ignored during the initial data synchronization phase.
+ </para>
+
</refsect1>
<refsect1>
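To illustrate the combining behavior described in the Notes above (the publication, subscription, and table names here are hypothetical, not taken from the patch), a subscription to two publications that filter the same table receives the union of the filtered rows:

```sql
-- Hypothetical example: two publications publish the same table with
-- different row filters.
CREATE PUBLICATION pub_low  FOR TABLE departments WHERE (id < 100);
CREATE PUBLICATION pub_high FOR TABLE departments WHERE (id > 900);

-- A subscription to both publications receives rows satisfying
-- (id < 100) OR (id > 900); everything in between is filtered out.
CREATE SUBSCRIPTION sub
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION pub_low, pub_high;
```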
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 62f10bc..0929aa0 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -36,6 +36,9 @@
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_relation.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
@@ -276,21 +279,54 @@ GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
}
/*
+ * Transform a publication WHERE clause, ensuring it is coerced to boolean and
+ * necessary collation information is added if required, and add a new
+ * nsitem/RTE for the associated relation to the ParseState's namespace list.
+ */
+Node *
+GetTransformedWhereClause(ParseState *pstate, PublicationRelInfo *pri,
+ bool fixup_collation)
+{
+ ParseNamespaceItem *nsitem;
+ Node *whereclause = NULL;
+
+ pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+ nsitem = addRangeTableEntryForRelation(pstate, pri->relation,
+ AccessShareLock, NULL, false, false);
+
+ addNSItemToQuery(pstate, nsitem, false, true, true);
+
+ whereclause = transformWhereClause(pstate, copyObject(pri->whereClause),
+ EXPR_KIND_PUBLICATION_WHERE,
+ "PUBLICATION WHERE");
+
+ /* Fix up collation information */
+ if (fixup_collation)
+ assign_expr_collations(pstate, whereclause);
+
+ return whereclause;
+}
+
+/*
* Insert new publication / relation mapping.
*/
ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists)
{
Relation rel;
HeapTuple tup;
Datum values[Natts_pg_publication_rel];
bool nulls[Natts_pg_publication_rel];
- Oid relid = RelationGetRelid(targetrel->relation);
+ Relation targetrel = pri->relation;
+ Oid relid = RelationGetRelid(targetrel);
Oid prrelid;
Publication *pub = GetPublication(pubid);
ObjectAddress myself,
referenced;
+ ParseState *pstate;
+ Node *whereclause = NULL;
List *relids = NIL;
rel = table_open(PublicationRelRelationId, RowExclusiveLock);
@@ -311,10 +347,22 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("relation \"%s\" is already member of publication \"%s\"",
- RelationGetRelationName(targetrel->relation), pub->name)));
+ RelationGetRelationName(targetrel), pub->name)));
}
- check_publication_add_relation(targetrel->relation);
+ check_publication_add_relation(targetrel);
+
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a ParseState to parse with */
+ pstate = make_parsestate(NULL);
+
+ /*
+ * Get the transformed WHERE clause, of boolean type, with necessary
+ * collation information.
+ */
+ whereclause = GetTransformedWhereClause(pstate, pri, true);
+ }
/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -328,6 +376,12 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);
+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] = CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+
tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
/* Insert tuple into catalog. */
@@ -344,6 +398,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
ObjectAddressSet(referenced, RelationRelationId, relid);
recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
+ /* Add dependency on the objects mentioned in the qualifications */
+ if (whereclause)
+ {
+ recordDependencyOnExpr(&myself, whereclause, pstate->p_rtable, DEPENDENCY_NORMAL);
+ free_parsestate(pstate);
+ }
+
/* Close the table. */
table_close(rel, RowExclusiveLock);
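As a quick way to see the effect of the catalog change above, the stored qualification can be deparsed with pg_get_expr(), the same function the tablesync query in this patch series uses (the publication and table names below are hypothetical):

```sql
-- Hypothetical: the WHERE clause is stored in pg_publication_rel.prqual
-- as a serialized node tree; pg_get_expr() deparses it back to SQL.
CREATE PUBLICATION p FOR TABLE departments WHERE (id > 2000);
SELECT prrelid::regclass AS relation,
       pg_get_expr(prqual, prrelid) AS row_filter
  FROM pg_publication_rel;
```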
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 404bb5d..9ca743c 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -529,40 +529,96 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
List *delrels = NIL;
ListCell *oldlc;
+ /*
+ * Check whether the relation is a member of an existing schema in the
+ * publication or of the specified schema list.
+ */
CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
PUBLICATIONOBJ_TABLE);
- /* Calculate which relations to drop. */
+ /*
+ * In order to recreate the relation list for the publication,
+ * look for existing relations that need not be dropped.
+ */
foreach(oldlc, oldrelids)
{
Oid oldrelid = lfirst_oid(oldlc);
- ListCell *newlc;
+ ListCell *newlc;
+ PublicationRelInfo *oldrel;
bool found = false;
+ HeapTuple rftuple;
+ bool rfisnull = true;
+ Node *oldrelwhereclause = NULL;
+
+ /* look up the cache for the old relmap */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(oldrelid),
+ ObjectIdGetDatum(pubid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ Datum whereClauseDatum;
+
+ whereClauseDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual,
+ &rfisnull);
+ if (!rfisnull)
+ oldrelwhereclause = stringToNode(TextDatumGetCString(whereClauseDatum));
+
+ ReleaseSysCache(rftuple);
+ }
foreach(newlc, rels)
{
PublicationRelInfo *newpubrel;
newpubrel = (PublicationRelInfo *) lfirst(newlc);
+
+ /*
+ * Check whether any of the new relations match the existing
+ * relations in the publication. Additionally, if a relation has
+ * an associated WHERE clause, check that the WHERE clauses also
+ * match. Drop the rest.
+ */
if (RelationGetRelid(newpubrel->relation) == oldrelid)
{
- found = true;
- break;
+ if (rfisnull && !newpubrel->whereClause)
+ {
+ found = true;
+ break;
+ }
+
+ if (!rfisnull && newpubrel->whereClause)
+ {
+ ParseState *pstate = make_parsestate(NULL);
+ Node *whereclause;
+
+ whereclause = GetTransformedWhereClause(pstate,
+ newpubrel,
+ false);
+ if (equal(oldrelwhereclause, whereclause))
+ {
+ free_parsestate(pstate);
+ found = true;
+ break;
+ }
+
+ free_parsestate(pstate);
+ }
}
}
- /* Not yet in the list, open it and add to the list */
- if (!found)
- {
- Relation oldrel;
- PublicationRelInfo *pubrel;
-
- /* Wrap relation into PublicationRelInfo */
- oldrel = table_open(oldrelid, ShareUpdateExclusiveLock);
- pubrel = palloc(sizeof(PublicationRelInfo));
- pubrel->relation = oldrel;
+ if (oldrelwhereclause)
+ pfree(oldrelwhereclause);
- delrels = lappend(delrels, pubrel);
+ /*
+ * Add the non-matched relations to a list so that they can
+ * be dropped.
+ */
+ if (!found)
+ {
+ oldrel = palloc(sizeof(PublicationRelInfo));
+ oldrel->whereClause = NULL;
+ oldrel->relation = table_open(oldrelid,
+ ShareUpdateExclusiveLock);
+ delrels = lappend(delrels, oldrel);
}
}
@@ -899,6 +955,7 @@ OpenTableList(List *tables)
List *relids = NIL;
List *rels = NIL;
ListCell *lc;
+ List *relids_with_rf = NIL;
/*
* Open, share-lock, and check all the explicitly-specified relations
@@ -926,15 +983,26 @@ OpenTableList(List *tables)
*/
if (list_member_oid(relids, myrelid))
{
+ /* Disallow duplicate tables if there are any with row-filters. */
+ if (t->whereClause || list_member_oid(relids_with_rf, myrelid))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("conflicting or redundant row-filters for \"%s\"",
+ RelationGetRelationName(rel))));
+
table_close(rel, ShareUpdateExclusiveLock);
continue;
}
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, myrelid);
+ if (t->whereClause)
+ relids_with_rf = lappend_oid(relids_with_rf, myrelid);
+
/*
* Add children of this rel, if requested, so that they too are added
* to the publication. A partitioned table can't have any inheritance
@@ -967,6 +1035,8 @@ OpenTableList(List *tables)
rel = table_open(childrelid, NoLock);
pub_rel = palloc(sizeof(PublicationRelInfo));
pub_rel->relation = rel;
+ /* child inherits WHERE clause from parent */
+ pub_rel->whereClause = t->whereClause;
rels = lappend(rels, pub_rel);
relids = lappend_oid(relids, childrelid);
}
@@ -974,6 +1044,7 @@ OpenTableList(List *tables)
}
list_free(relids);
+ list_free(relids_with_rf);
return rels;
}
@@ -993,6 +1064,8 @@ CloseTableList(List *rels)
pub_rel = (PublicationRelInfo *) lfirst(lc);
table_close(pub_rel->relation, NoLock);
}
+
+ list_free_deep(rels);
}
/*
@@ -1088,6 +1161,11 @@ PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
RelationGetRelationName(rel))));
}
+ if (pubrel->whereClause)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE")));
+
ObjectAddressSet(obj, PublicationRelRelationId, prid);
performDeletion(&obj, DROP_CASCADE, 0);
}
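Given the new check above, attaching a WHERE clause when dropping a table from a publication is rejected; a hypothetical session (publication and table names invented for illustration) would look like:

```sql
-- Hypothetical: row filters may only be specified when adding tables.
ALTER PUBLICATION active_departments
    DROP TABLE departments WHERE (active IS TRUE);
-- ERROR:  invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
```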
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index df0b747..bd55ea6 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4833,6 +4833,7 @@ _copyPublicationTable(const PublicationTable *from)
PublicationTable *newnode = makeNode(PublicationTable);
COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(whereClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index cb7ddd4..028b8e5 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2312,6 +2312,7 @@ static bool
_equalPublicationTable(const PublicationTable *a, const PublicationTable *b)
{
COMPARE_NODE_FIELD(relation);
+ COMPARE_NODE_FIELD(whereClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 3d4dd43..9da93a0 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9742,12 +9742,13 @@ CreatePublicationStmt:
* relation_expr here.
*/
PublicationObjSpec:
- TABLE relation_expr
+ TABLE relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_TABLE;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $2;
+ $$->pubtable->whereClause = $3;
}
| ALL TABLES IN_P SCHEMA ColId
{
@@ -9762,28 +9763,45 @@ PublicationObjSpec:
$$->pubobjtype = PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA;
$$->location = @5;
}
- | ColId
+ | ColId OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
- $$->name = $1;
+ if ($2)
+ {
+ /*
+ * The OptWhereClause must be stored here but it is
+ * valid only for tables. If the ColId was mistakenly
+ * not a table this will be detected later in
+ * preprocess_pubobj_list() and an error is thrown.
+ */
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = makeRangeVar(NULL, $1, @1);
+ $$->pubtable->whereClause = $2;
+ }
+ else
+ {
+ $$->name = $1;
+ }
$$->location = @1;
}
- | ColId indirection
+ | ColId indirection OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = makeRangeVarFromQualifiedName($1, $2, @1, yyscanner);
+ $$->pubtable->whereClause = $3;
$$->location = @1;
}
/* grammar like tablename * , ONLY tablename, ONLY ( tablename ) */
- | extended_relation_expr
+ | extended_relation_expr OptWhereClause
{
$$ = makeNode(PublicationObjSpec);
$$->pubobjtype = PUBLICATIONOBJ_CONTINUATION;
$$->pubtable = makeNode(PublicationTable);
$$->pubtable->relation = $1;
+ $$->pubtable->whereClause = $2;
}
| CURRENT_SCHEMA
{
@@ -17431,7 +17449,8 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("invalid table name at or near"),
parser_errposition(pubobj->location));
- else if (pubobj->name)
+
+ if (pubobj->name)
{
/* convert it to PublicationTable */
PublicationTable *pubtable = makeNode(PublicationTable);
@@ -17444,6 +17463,13 @@ preprocess_pubobj_list(List *pubobjspec_list, core_yyscan_t yyscanner)
else if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_SCHEMA ||
pubobj->pubobjtype == PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA)
{
+ /* WHERE clause is not allowed on a schema object */
+ if (pubobj->pubtable && pubobj->pubtable->whereClause)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("WHERE clause not allowed for schema"),
+ parser_errposition(pubobj->location));
+
/*
* We can distinguish between the different type of schema
* objects based on whether name and pubtable is set.
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7d829a0..193c87d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,6 +551,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in publication WHERE expressions");
+ else
+ err = _("grouping operations are not allowed in publication WHERE expressions");
+
+ break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -943,6 +950,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("window functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 2d1a477..3d43839 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,8 +200,19 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
+ {
+ /*
+ * Forbid functions in publication WHERE condition
+ */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("functions are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, exprLocation(expr))));
+
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
+ }
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -504,6 +515,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
+ case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1764,6 +1776,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("cannot use subquery in publication WHERE expression");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3084,6 +3099,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
+ case EXPR_KIND_PUBLICATION_WHERE:
+ return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 542f916..29bebb7 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,6 +2655,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
+ case EXPR_KIND_PUBLICATION_WHERE:
+ err = _("set-returning functions are not allowed in publication WHERE expressions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index bc34a23..29f8835 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,6 +718,13 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
+ /* Check it's not a custom operator for publication WHERE expressions */
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("user-defined operators are not allowed in publication WHERE expressions"),
+ parser_errposition(pstate, location)));
+
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index c5c3f26..036d9c6 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -3538,11 +3538,20 @@ errorMissingRTE(ParseState *pstate, RangeVar *relation)
rte->eref->aliasname)),
parser_errposition(pstate, relation->location)));
else
+ {
+ if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_TABLE),
+ errmsg("invalid reference to table \"%s\" in publication WHERE expression",
+ relation->relname),
+ parser_errposition(pstate, relation->location)));
+
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_TABLE),
errmsg("missing FROM-clause entry for table \"%s\"",
relation->relname),
parser_errposition(pstate, relation->location)));
+ }
}
/*
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..c20c221 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -687,20 +687,24 @@ copy_read_data(void *outbuf, int minread, int maxread)
/*
- * Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * Get information about a remote relation, in a similar fashion to how the
+ * RELATION message provides information during replication. This function also
+ * returns the relation qualifications to be used in the COPY command.
*/
static void
fetch_remote_table_info(char *nspname, char *relname,
- LogicalRepRelation *lrel)
+ LogicalRepRelation *lrel, List **qual)
{
WalRcvExecResult *res;
StringInfoData cmd;
TupleTableSlot *slot;
Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
Oid attrRow[] = {TEXTOID, OIDOID, BOOLOID};
+ Oid qualRow[] = {TEXTOID};
bool isnull;
int natt;
+ ListCell *lc;
+ bool first;
lrel->nspname = nspname;
lrel->relname = relname;
@@ -796,6 +800,80 @@ fetch_remote_table_info(char *nspname, char *relname,
lrel->natts = natt;
walrcv_clear_result(res);
+
+ /*
+ * Get the relation qual. DISTINCT prevents the same expression, for a
+ * table that is in multiple publications, from being included multiple
+ * times in the final expression.
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
+ {
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr "
+ " ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u "
+ " AND p.pubname IN (", lrel->remoteid);
+
+ first = true;
+ foreach(lc, MySubscription->publications)
+ {
+ char *pubname = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ }
+ appendStringInfoChar(&cmd, ')');
+
+ res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch relation qualifications for table \"%s.%s\" from publisher: %s",
+ nspname, relname, res->err)));
+
+ /*
+ * Multiple row filter expressions for the same table will be combined
+ * by COPY using OR. If any of the filter expressions for this table are
+ * null, it means the whole table will be copied. In this case it is not
+ * necessary to construct a unified row filter expression at all.
+ */
+ slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
+ {
+ Datum rf = slot_getattr(slot, 1, &isnull);
+
+ if (!isnull)
+ *qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
+
+ ExecClearTuple(slot);
+
+ /*
+ * A single entry without a row filter expression means the whole
+ * table is copied, so discard any previously collected expressions
+ * and return none.
+ */
+ if (isnull)
+ {
+ if (*qual)
+ {
+ list_free_deep(*qual);
+ *qual = NIL;
+ }
+ break;
+ }
+ }
+ ExecDropSingleTupleTableSlot(slot);
+
+ walrcv_clear_result(res);
+ }
+
pfree(cmd.data);
}
@@ -809,6 +887,7 @@ copy_table(Relation rel)
{
LogicalRepRelMapEntry *relmapentry;
LogicalRepRelation lrel;
+ List *qual = NIL;
WalRcvExecResult *res;
StringInfoData cmd;
CopyFromState cstate;
@@ -817,7 +896,7 @@ copy_table(Relation rel)
/* Get the publisher relation info. */
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
- RelationGetRelationName(rel), &lrel);
+ RelationGetRelationName(rel), &lrel, &qual);
/* Put the relation into relmap. */
logicalrep_relmap_update(&lrel);
@@ -828,14 +907,18 @@ copy_table(Relation rel)
/* Start copy on the publisher. */
initStringInfo(&cmd);
- if (lrel.relkind == RELKIND_RELATION)
+
+ /* Regular table with no row filter */
+ if (lrel.relkind == RELKIND_RELATION && qual == NIL)
appendStringInfo(&cmd, "COPY %s TO STDOUT",
quote_qualified_identifier(lrel.nspname, lrel.relname));
else
{
/*
* For non-tables, we need to do COPY (SELECT ...), but we can't just
- * do SELECT * because we need to not copy generated columns.
+ * do SELECT * because we need to not copy generated columns. For
+ * tables with any row filters, build a SELECT query with OR'ed row
+ * filters for COPY.
*/
appendStringInfoString(&cmd, "COPY (SELECT ");
for (int i = 0; i < lrel.natts; i++)
@@ -844,8 +927,29 @@ copy_table(Relation rel)
if (i < lrel.natts - 1)
appendStringInfoString(&cmd, ", ");
}
- appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+ appendStringInfo(&cmd, " FROM %s",
quote_qualified_identifier(lrel.nspname, lrel.relname));
+ /* list of OR'ed filters */
+ if (qual != NIL)
+ {
+ ListCell *lc;
+ bool first = true;
+
+ appendStringInfoString(&cmd, " WHERE ");
+ foreach(lc, qual)
+ {
+ char *q = strVal(lfirst(lc));
+
+ if (first)
+ first = false;
+ else
+ appendStringInfoString(&cmd, " OR ");
+ appendStringInfoString(&cmd, q);
+ }
+ list_free_deep(qual);
+ }
+
+ appendStringInfoString(&cmd, ") TO STDOUT");
}
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
pfree(cmd.data);
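To sketch what copy_table() ends up sending for a filtered table (the column names and filter expressions here are hypothetical), the generated command has this shape:

```sql
-- Hypothetical command built by copy_table() when the table is published
-- with two row filters: the filters are OR'ed into a single WHERE clause.
COPY (SELECT id, name, active FROM public.departments
      WHERE (id < 100) OR (id > 900)) TO STDOUT;
```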
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 6f6a203..2fa08e7 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -15,16 +15,24 @@
#include "access/tupconvert.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "commands/defrem.h"
+#include "executor/executor.h"
#include "fmgr.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/makefuncs.h"
+#include "optimizer/optimizer.h"
+#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
+#include "utils/builtins.h"
#include "utils/int8.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -116,6 +124,24 @@ typedef struct RelationSyncEntry
PublicationActions pubactions;
/*
+ * An ExprState pointer alone cannot distinguish "no cache", "invalid
+ * cache" and "valid cache", so the flag exprstate_valid indicates
+ * whether the current cache is valid.
+ *
+ * Multiple ExprState entries might be used if there are multiple
+ * publications for a single table. Because publication actions differ,
+ * the expressions cannot always be combined into a single one, so
+ * there is one ExprState per publication action. Only 3 publication
+ * actions ("insert", "update", "delete") are used for row filtering.
+ * The exprstate array is indexed by ReorderBufferChangeType.
+ */
+ bool exprstate_valid;
+#define IDX_PUBACTION_n 3
+ ExprState *exprstate[IDX_PUBACTION_n]; /* ExprState array for row filter.
+ One per publication action. */
+ TupleTableSlot *scantuple; /* tuple table slot for row filter */
+
+ /*
* OID of the relation to publish changes as. For a partition, this may
* be set to one of its ancestors whose schema will be used when
* replicating changes, if publish_via_partition_root is set for the
@@ -137,7 +163,7 @@ static HTAB *RelationSyncCache = NULL;
static void init_rel_sync_cache(MemoryContext decoding_context);
static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation relation);
static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
uint32 hashvalue);
@@ -146,6 +172,14 @@ static void set_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
TransactionId xid);
+/* row filter routines */
+static EState *create_estate_for_relation(Relation rel);
+static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
+static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
+static bool pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
+ Relation relation, HeapTuple oldtuple,
+ HeapTuple newtuple, RelationSyncEntry *entry);
+
/*
* Specify output plugin callbacks
*/
@@ -621,6 +655,316 @@ send_relation_and_attrs(Relation relation, TransactionId xid,
}
/*
+ * Executor state preparation for evaluation of row filter expressions for the
+ * specified relation.
+ */
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}
+
+/*
+ * Initialize for row filter expression execution.
+ */
+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
+ ExprState *exprstate;
+ Oid exprtype;
+ Expr *expr;
+
+ /* Cache ExprState using CacheMemoryContext. */
+ Assert(CurrentMemoryContext == CacheMemoryContext);
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype, BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be cast to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the row filter.")));
+
+ /*
+ * This is the same code as ExecPrepareExpr() but that is not used because
+ * we have no EState to pass it. There should probably be another function
+ * in the executor to handle the execution outside a normal Plan tree
+ * context.
+ */
+ expr = expression_planner(expr);
+ exprstate = ExecInitExpr(expr, NULL);
+
+ return exprstate;
+}
+
+/*
+ * Evaluates row filter.
+ *
+ * If the row filter evaluates to NULL, it is taken as false i.e. the change
+ * isn't replicated.
+ */
+static bool
+pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
+{
+ Datum ret;
+ bool isnull;
+
+ Assert(state != NULL);
+
+ ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
+
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+ DatumGetBool(ret) ? "true" : "false",
+ isnull ? "true" : "false");
+
+ if (isnull)
+ return false;
+
+ return DatumGetBool(ret);
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
+ Relation relation, HeapTuple oldtuple, HeapTuple newtuple,
+ RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ ListCell *lc;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+ List *rfnodes[] = {NIL, NIL, NIL}; /* One per pubaction */
+ bool no_filter[] = {false, false, false}; /* One per pubaction */
+
+ Assert(changetype == REORDER_BUFFER_CHANGE_INSERT ||
+ changetype == REORDER_BUFFER_CHANGE_UPDATE ||
+ changetype == REORDER_BUFFER_CHANGE_DELETE);
+ /*
+ * If the row filter cache is currently flagged "invalid", we do not
+ * yet know whether there are any row filters for this relation.
+ *
+ * This code usually executes only once.
+ *
+ * NOTE: The ExprState cache could have been created up-front in the
+ * function get_rel_sync_entry() instead of the deferred on-the-fly
+ * assignment below. The reason for choosing to do it here is because there
+ * are some scenarios where the get_rel_sync_entry() is called but where a
+ * row will not be published. For example, for truncate, we may not need
+ * any row evaluation, so there is no need to compute it. It would also be
+ * a waste if any error happens before actually evaluating the filter. And
+ * tomorrow there could be other operations (which use get_rel_sync_entry)
+ * but which don't need to build ExprState. Furthermore, because the
+ * decision to publish or not is made AFTER the call to get_rel_sync_entry
+ * it may be that the filter evaluation is not necessary at all. So the
+ * decision was to defer this logic to last moment when we know it will be
+ * needed.
+ */
+ if (!entry->exprstate_valid)
+ {
+ MemoryContext oldctx;
+ int idx;
+ bool found_filters = false;
+ int idx_ins = REORDER_BUFFER_CHANGE_INSERT;
+ int idx_upd = REORDER_BUFFER_CHANGE_UPDATE;
+ int idx_del = REORDER_BUFFER_CHANGE_DELETE;
+
+ /*
+ * Find if there are any row filters for this relation. If there are,
+ * then prepare the necessary ExprState and cache it in entry->exprstate.
+ *
+ * NOTE: All publication-table mappings must be checked.
+ *
+ * NOTE: If the relation is a partition and pubviaroot is true, use
+ * the row filter of the topmost partitioned table instead of the row
+ * filter of its own partition.
+ *
+ * NOTE: Multiple publications might have multiple row filters for this
+ * relation. Since row filter usage depends on the DML operation,
+ * there are multiple lists (one for each operation) which row filters
+ * will be appended.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+
+ /*
+ * Check whether there is a row filter and, if so, remember it in a list
+ * (per pubaction). If not, remember that this pubaction had no filter.
+ * The code following this 'publications' loop will combine all filters.
+ */
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(entry->publish_as_relid), ObjectIdGetDatum(pub->oid));
+ if (HeapTupleIsValid(rftuple))
+ {
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple, Anum_pg_publication_rel_prqual, &rfisnull);
+
+ if (!rfisnull)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ /* Gather the rfnodes per pubaction of this publication. */
+ if (pub->pubactions.pubinsert)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[idx_ins] = lappend(rfnodes[idx_ins], rfnode);
+ }
+ if (pub->pubactions.pubupdate)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[idx_upd] = lappend(rfnodes[idx_upd], rfnode);
+ }
+ if (pub->pubactions.pubdelete)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfnodes[idx_del] = lappend(rfnodes[idx_del], rfnode);
+ }
+ MemoryContextSwitchTo(oldctx);
+ }
+ else
+ {
+ /* Remember which pubactions have no row-filter. */
+ if (pub->pubactions.pubinsert)
+ no_filter[idx_ins] = true;
+ if (pub->pubactions.pubupdate)
+ no_filter[idx_upd] = true;
+ if (pub->pubactions.pubdelete)
+ no_filter[idx_del] = true;
+
+ /* Quit the loop early once all pubactions are known to have no row filter. */
+ if (no_filter[idx_ins] && no_filter[idx_upd] && no_filter[idx_del])
+ {
+ ReleaseSysCache(rftuple);
+ break;
+ }
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+
+ } /* loop all subscribed publications */
+
+ /*
+ * Now all the filters for all pubactions are known. Combine the
+ * filters that belong to the same pubaction.
+ *
+ * All row filter expressions for a pubaction are discarded if any
+ * publication-relation entry lacks a row filter. That's because the
+ * expressions are combined with the OR operator, under which a single
+ * matching expression is enough to publish a row; the absence of a row
+ * filter means replicate all rows, making any other filters redundant.
+ */
+ for (idx = 0; idx < IDX_PUBACTION_n; idx++)
+ {
+ int n_filters;
+
+ if (no_filter[idx])
+ {
+ if (rfnodes[idx])
+ {
+ list_free_deep(rfnodes[idx]);
+ rfnodes[idx] = NIL;
+ }
+ }
+
+ /*
+ * If there were one or more filters for this pubaction, combine them
+ * (if necessary) and cache the resulting ExprState.
+ */
+ n_filters = list_length(rfnodes[idx]);
+ if (n_filters > 0)
+ {
+ Node *rfnode;
+
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes[idx], -1) : linitial(rfnodes[idx]);
+ entry->exprstate[idx] = pgoutput_row_filter_init_expr(rfnode);
+ MemoryContextSwitchTo(oldctx);
+
+ found_filters = true; /* flag that a scan slot will be needed */
+ }
+ } /* for each pubaction */
+
+ if (found_filters)
+ {
+ TupleDesc tupdesc = RelationGetDescr(relation);
+
+ /*
+ * Create tuple table slots for row filter. Create a copy of the
+ * TupleDesc as it needs to live as long as the cache remains.
+ */
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+ tupdesc = CreateTupleDescCopy(tupdesc);
+ entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ MemoryContextSwitchTo(oldctx);
+ }
+
+ entry->exprstate_valid = true;
+ }
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate[changetype])
+ return true;
+
+ if (message_level_is_interesting(DEBUG3))
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid));
+
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ estate = create_estate_for_relation(relation);
+
+ /* Prepare context per tuple */
+ ecxt = GetPerTupleExprContext(estate);
+ ecxt->ecxt_scantuple = entry->scantuple;
+
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+
+ /*
+ * NOTE: Multiple publication row filters have already been combined into
+ * a single exprstate (for this pubaction).
+ */
+ if (entry->exprstate[changetype])
+ {
+ /* Evaluate the row filter */
+ result = pgoutput_row_filter_exec_expr(entry->exprstate[changetype], ecxt);
+ }
+
+ /* Cleanup allocated resources */
+ ResetExprContext(ecxt);
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+}
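To make the combination rules in the comments above concrete, here is a small Python model of how per-pubaction row filters from multiple publications are merged (all names here are hypothetical illustrations, not the patch's C code): filters are OR'ed together, and a single publication without a filter makes the remaining filters redundant.

```python
# Model of the row-filter combination rules described in the patch's
# comments (hypothetical helper names; not the patch's C code).

def combine_filters(pub_filters):
    """pub_filters: one entry per publication publishing this pubaction,
    each a predicate taking a row dict, or None meaning 'no filter'.
    Returns a combined predicate, or None meaning 'publish every row'."""
    if any(f is None for f in pub_filters):
        # One filterless publication makes the other filters redundant,
        # because the filters are OR'ed and 'no filter' matches all rows.
        return None
    # Otherwise OR all the filter expressions together.
    return lambda row: any(f(row) for f in pub_filters)

def publish(row, combined):
    """Mirror of the 'bail out if there is no row filter' check."""
    return True if combined is None else combined(row)
```

For example, with filters `a > 1` and `a < 0` from two publications, a row with `a = 2` is published (first filter matches) while `a = 0` is not.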
+
+/*
* Sends the decoded DML over wire.
*
* This is called both in streaming and non-streaming modes.
@@ -647,7 +991,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (in_streaming)
xid = change->txn->xid;
- relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+ relentry = get_rel_sync_entry(data, relation);
/* First check the table filter */
switch (change->action)
@@ -671,8 +1015,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
- maybe_send_schema(ctx, change, relation, relentry);
-
/* Send the data */
switch (change->action)
{
@@ -680,6 +1022,16 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(change->action, data, relation, NULL, tuple, relentry))
+ break;
+
+ /*
+ * The schema must be sent before the relation-switching logic below,
+ * because maybe_send_schema() also sends the ancestor's relation.
+ */
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -703,6 +1055,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(change->action, data, relation, oldtuple, newtuple, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -731,6 +1089,12 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
+ /* Check row filter. */
+ if (!pgoutput_row_filter(change->action, data, relation, oldtuple, NULL, relentry))
+ break;
+
+ maybe_send_schema(ctx, change, relation, relentry);
+
/* Switch relation if publishing via root. */
if (relentry->publish_as_relid != RelationGetRelid(relation))
{
@@ -794,7 +1158,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (!is_publishable_relation(relation))
continue;
- relentry = get_rel_sync_entry(data, relid);
+ relentry = get_rel_sync_entry(data, relation);
if (!relentry->pubactions.pubtruncate)
continue;
@@ -1116,9 +1480,10 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
* when publishing.
*/
static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation relation)
{
RelationSyncEntry *entry;
+ Oid relid = RelationGetRelid(relation);
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
bool found;
@@ -1139,8 +1504,13 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->schema_sent = false;
entry->streamed_txns = NIL;
entry->replicate_valid = false;
+ entry->exprstate_valid = false;
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
+ entry->scantuple = NULL;
+ entry->exprstate[REORDER_BUFFER_CHANGE_INSERT] = NULL;
+ entry->exprstate[REORDER_BUFFER_CHANGE_UPDATE] = NULL;
+ entry->exprstate[REORDER_BUFFER_CHANGE_DELETE] = NULL;
entry->publish_as_relid = InvalidOid;
entry->map = NULL; /* will be set by maybe_send_schema() if
* needed */
@@ -1245,9 +1615,6 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
}
- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
}
list_free(pubids);
@@ -1310,6 +1677,7 @@ static void
rel_sync_cache_relation_cb(Datum arg, Oid relid)
{
RelationSyncEntry *entry;
+ int idx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1354,6 +1722,25 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
free_conversion_map(entry->map);
}
entry->map = NULL;
+
+ /*
+ * Clean up the row filter cache. (It will be rebuilt later if needed.)
+ */
+ entry->exprstate_valid = false;
+ if (entry->scantuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(entry->scantuple);
+ entry->scantuple = NULL;
+ }
+ /* Cleanup the ExprState for each of the pubactions. */
+ for (idx = 0; idx < IDX_PUBACTION_n; idx++)
+ {
+ if (entry->exprstate[idx] != NULL)
+ {
+ pfree(entry->exprstate[idx]);
+ entry->exprstate[idx] = NULL;
+ }
+ }
}
}
@@ -1365,6 +1752,7 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
{
HASH_SEQ_STATUS status;
RelationSyncEntry *entry;
+ MemoryContext oldctx;
/*
* We can get here if the plugin was used in SQL interface as the
@@ -1374,6 +1762,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
if (RelationSyncCache == NULL)
return;
+ oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
/*
* There is no way to find which entry in our cache the hash belongs to so
* mark the whole cache as invalid.
@@ -1392,6 +1782,8 @@ rel_sync_cache_publication_cb(Datum arg, int cacheid, uint32 hashvalue)
entry->pubactions.pubdelete = false;
entry->pubactions.pubtruncate = false;
}
+
+ MemoryContextSwitchTo(oldctx);
}
/* Send Replication origin */
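The invalidation callbacks above follow the usual lazy-cache pattern: the callback only frees the cached state and clears the valid flag, and the next lookup rebuilds the state on demand. A minimal Python sketch of that lifecycle (hypothetical names, for illustration only):

```python
# Sketch of the invalidate-then-rebuild-lazily pattern used for the
# RelationSyncEntry row-filter cache (hypothetical names, not C code).

class FilterCacheEntry:
    def __init__(self, build):
        self._build = build    # callable that computes the cached state
        self._state = None
        self._valid = False

    def get(self):
        # Deferred build: pay the cost only when the state is first needed.
        if not self._valid:
            self._state = self._build()
            self._valid = True
        return self._state

    def invalidate(self):
        # Invalidation callback: drop the state and clear the flag so the
        # next get() rebuilds it.
        self._state = None
        self._valid = False
```

Repeated `get()` calls reuse the cached state; after `invalidate()`, the next `get()` triggers exactly one rebuild.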
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c28788e..929b2f5 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2868,17 +2868,21 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
" JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
"WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
"UNION\n"
"SELECT pubname\n"
+ " , pg_get_expr(pr.prqual, c.oid)\n"
"FROM pg_catalog.pg_publication p\n"
" JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+ " JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
"WHERE pr.prrelid = '%s'\n"
"UNION\n"
"SELECT pubname\n"
+ " , NULL\n"
"FROM pg_catalog.pg_publication p\n"
"WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
"ORDER BY 1;",
@@ -2914,6 +2918,13 @@ describeOneTableDetails(const char *schemaname,
printfPQExpBuffer(&buf, " \"%s\"",
PQgetvalue(result, i, 0));
+ /* row filter (if any) */
+ if (pset.sversion >= 150000)
+ {
+ if (!PQgetisnull(result, i, 1))
+ appendPQExpBuffer(&buf, " WHERE %s", PQgetvalue(result, i, 1));
+ }
+
printTableAddFooter(&cont, buf.data);
}
PQclear(result);
@@ -5833,8 +5844,12 @@ addFooterToPublicationDesc(PQExpBuffer buf, char *footermsg,
for (i = 0; i < count; i++)
{
if (!singlecol)
+ {
printfPQExpBuffer(buf, " \"%s.%s\"", PQgetvalue(res, i, 0),
PQgetvalue(res, i, 1));
+ if (!PQgetisnull(res, i, 2))
+ appendPQExpBuffer(buf, " WHERE %s", PQgetvalue(res, i, 2));
+ }
else
printfPQExpBuffer(buf, " \"%s\"", PQgetvalue(res, i, 0));
@@ -5963,8 +5978,15 @@ describePublications(const char *pattern)
{
/* Get the tables for the specified publication */
printfPQExpBuffer(&buf,
- "SELECT n.nspname, c.relname\n"
- "FROM pg_catalog.pg_class c,\n"
+ "SELECT n.nspname, c.relname");
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf,
+ ", pg_get_expr(pr.prqual, c.oid)");
+ else
+ appendPQExpBufferStr(&buf,
+ ", NULL");
+ appendPQExpBuffer(&buf,
+ "\nFROM pg_catalog.pg_class c,\n"
" pg_catalog.pg_namespace n,\n"
" pg_catalog.pg_publication_rel pr\n"
"WHERE c.relnamespace = n.oid\n"
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 902f2f2..96c55f6 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -20,6 +20,7 @@
#include "catalog/genbki.h"
#include "catalog/objectaddress.h"
#include "catalog/pg_publication_d.h"
+#include "parser/parse_node.h"
/* ----------------
* pg_publication definition. cpp turns this into
@@ -86,6 +87,7 @@ typedef struct Publication
typedef struct PublicationRelInfo
{
Relation relation;
+ Node *whereClause;
} PublicationRelInfo;
extern Publication *GetPublication(Oid pubid);
@@ -123,13 +125,16 @@ extern List *GetPubPartitionOptionRelations(List *result,
extern bool is_publishable_relation(Relation rel);
extern bool is_schema_publication(Oid pubid);
-extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+extern ObjectAddress publication_add_relation(Oid pubid, PublicationRelInfo *pri,
bool if_not_exists);
extern ObjectAddress publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists);
extern Oid get_publication_oid(const char *pubname, bool missing_ok);
extern char *get_publication_name(Oid pubid, bool missing_ok);
+extern Node *GetTransformedWhereClause(ParseState *pstate,
+ PublicationRelInfo *pri,
+ bool bfixupcollation);
#endif /* PG_PUBLICATION_H */
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index b5d5504..154bb61 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,10 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
Oid oid; /* oid */
Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ pg_node_tree prqual; /* qualifications */
+#endif
} FormData_pg_publication_rel;
/* ----------------
@@ -40,6 +44,8 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
*/
typedef FormData_pg_publication_rel *Form_pg_publication_rel;
+DECLARE_TOAST(pg_publication_rel, 8287, 8288);
+
DECLARE_UNIQUE_INDEX_PKEY(pg_publication_rel_oid_index, 6112, PublicationRelObjectIndexId, on pg_publication_rel using btree(oid oid_ops));
DECLARE_UNIQUE_INDEX(pg_publication_rel_prrelid_prpubid_index, 6113, PublicationRelPrrelidPrpubidIndexId, on pg_publication_rel using btree(prrelid oid_ops, prpubid oid_ops));
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 4c5a8a3..e437a55 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3642,6 +3642,7 @@ typedef struct PublicationTable
{
NodeTag type;
RangeVar *relation; /* relation to be published */
+ Node *whereClause; /* qualifications */
} PublicationTable;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index ee17908..d58ae6a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,6 +80,7 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
+ EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 5ac2d66..5a49003 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -239,6 +239,157 @@ ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tbl6(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
+
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
+ "public.testpub_rf_tbl3" WHERE ((e > 1000) AND (e < 2000))
+
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE ((e > 1000) AND (e < 2000))
+
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+ Publication testpub5
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
+
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dplus_rf_yes FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dplus_rf_no FOR TABLE testpub_rf_tbl1;
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+ Table "public.testpub_rf_tbl1"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+----------+--------------+-------------
+ a | integer | | | | plain | |
+ b | text | | | | extended | |
+Publications:
+ "testpub_dplus_rf_no"
+ "testpub_dplus_rf_yes" WHERE (a > 1)
+
+DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+ Publication testpub_syntax1
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "public.testpub_rf_tbl3" WHERE (e < 999)
+
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+ Publication testpub_syntax2
+ Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f | t | t | t | t | f
+Tables:
+ "public.testpub_rf_tbl1"
+ "testpub_rf_myschema.testpub_rf_tbl5" WHERE (h < 999)
+
+DROP PUBLICATION testpub_syntax2;
+-- fail - a WHERE row filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+ERROR: syntax error at or near "WHERE"
+LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a =...
+ ^
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+ERROR: WHERE clause for schema not allowed
+LINE 1: ...ax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf...
+ ^
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
+ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
+RESET client_min_messages;
+-- fail - aggregate functions not allowed in WHERE clause
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e < AVG(e));
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...TION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e < AVG(e));
+ ^
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+ERROR: functions are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
+ ^
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: user-defined operators are not allowed in publication WHERE expressions
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ ^
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tbl6 WHERE (i < 99);
+ERROR: cannot add relation "testpub_rf_myschema1.testpub_rf_tbl6" to publication
+DETAIL: Table's schema "testpub_rf_myschema1" is already part of the publication or part of the specified schema list.
+RESET client_min_messages;
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tbl6;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 56dd358..47bdba8 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -134,6 +134,82 @@ UPDATE testpub_parted2 SET a = 2;
DROP TABLE testpub_parted1, testpub_parted2;
DROP PUBLICATION testpub_forparted, testpub_forparted1;
+CREATE TABLE testpub_rf_tbl1 (a integer, b text);
+CREATE TABLE testpub_rf_tbl2 (c text, d integer);
+CREATE TABLE testpub_rf_tbl3 (e integer);
+CREATE TABLE testpub_rf_tbl4 (g text);
+CREATE SCHEMA testpub_rf_myschema;
+CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
+CREATE SCHEMA testpub_rf_myschema1;
+CREATE TABLE testpub_rf_myschema1.testpub_rf_tbl6(i integer);
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
+\dRp+ testpub5
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
+\dRp+ testpub5
+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE expression)
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
+\dRp+ testpub5
+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dplus_rf_yes FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dplus_rf_no FOR TABLE testpub_rf_tbl1;
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
+-- some more syntax tests to exercise other parser pathways
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax1
+DROP PUBLICATION testpub_syntax1;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;
+-- fail - a WHERE row filter is not allowed for schemas
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+RESET client_min_messages;
+-- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
+RESET client_min_messages;
+-- fail - aggregate functions not allowed in WHERE clause
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e < AVG(e));
+-- fail - functions disallowed
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined operators disallowed
+CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tbl6 WHERE (i < 99);
+RESET client_min_messages;
+
+DROP TABLE testpub_rf_tbl1;
+DROP TABLE testpub_rf_tbl2;
+DROP TABLE testpub_rf_tbl3;
+DROP TABLE testpub_rf_tbl4;
+DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
+DROP TABLE testpub_rf_myschema1.testpub_rf_tbl6;
+DROP SCHEMA testpub_rf_myschema;
+DROP SCHEMA testpub_rf_myschema1;
+DROP PUBLICATION testpub5;
+DROP PUBLICATION testpub7;
+DROP OPERATOR =#>(integer, integer);
+DROP FUNCTION testpub_rf_func(integer, integer);
+
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
new file mode 100644
index 0000000..64e71d0
--- /dev/null
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -0,0 +1,357 @@
+# Test logical replication behavior with row filtering
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 10;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+# setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_2 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_4 (c int primary key)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_partitioned (a int primary key, b integer) PARTITION BY RANGE(a)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_less_10k (LIKE tab_rowfilter_partitioned)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_less_10k FOR VALUES FROM (MINVALUE) TO (10000)"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_rowfilter_greater_10k (LIKE tab_rowfilter_partitioned)"
+);
+$node_subscriber->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_partitioned ATTACH PARTITION tab_rowfilter_greater_10k FOR VALUES FROM (10000) TO (MAXVALUE)"
+);
+
+# setup logical replication
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered')"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 ADD TABLE tab_rowfilter_2 WHERE (c % 7 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_1 SET TABLE tab_rowfilter_1 WHERE (a > 1000 AND b <> 'filtered'), tab_rowfilter_2 WHERE (c % 2 = 0), tab_rowfilter_3"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_2 FOR TABLE tab_rowfilter_2 WHERE (c % 3 = 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_3 FOR TABLE tab_rowfilter_partitioned WHERE (a < 5000)"
+);
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 ADD TABLE tab_rowfilter_less_10k WHERE (a < 6000)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_not_used FOR TABLE tab_rowfilter_1 WHERE (a < 0)"
+);
+
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4a FOR TABLE tab_rowfilter_4 WHERE (c % 2 = 0)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_4b FOR TABLE tab_rowfilter_4"
+);
+
+#
+# The following INSERTs are executed before the CREATE SUBSCRIPTION, so these
+# SQL commands are for testing the initial data copy using logical replication.
+#
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1, 'not replicated')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1500, 'filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1980, 'not filtered')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) SELECT x, 'test ' || x FROM generate_series(990,1002) x"
+);
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) SELECT generate_series(1, 20)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_3 (a, b) SELECT x, (x % 3 = 0) FROM generate_series(1, 10) x");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) SELECT generate_series(1, 10)");
+
+# insert data into partitioned table and directly on the partition
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(1, 100),(7000, 101),(15000, 102),(5500, 300)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(2, 200),(6005, 201)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4a, tap_pub_4b"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# wait for initial table synchronization to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+# - INSERT (1, 'not replicated') NO, because a is not > 1000
+# - INSERT (1500, 'filtered') NO, because b == 'filtered'
+# - INSERT (1980, 'not filtered') YES
+# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
+#
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is( $result, qq(1001|test 1001
+1002|test 1002
+1980|not filtered), 'check initial data copy from table tab_rowfilter_1');
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if
+# c is divisible by 2 OR 3 (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(13|2|20), 'check initial data copy from table tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but when there is no filter it just means
+# OR everything - i.e. the same as no filter at all.
+# Expect all rows: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(10|1|10), 'check initial data copy from table tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_3
+# There is no filter. 10 rows are inserted, so 10 rows are replicated.
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_rowfilter_3");
+is($result, qq(10), 'check initial data copy from table tab_rowfilter_3');
+
+# Check expected replicated rows for partitions
+# The publication option publish_via_partition_root is false, so the row
+# filter of each partition is used.
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# INSERT into tab_rowfilter_partitioned:
+# - INSERT (1,100) YES, because 1 < 6000
+# - INSERT (7000, 101) NO, because 7000 is not < 6000
+# - INSERT (15000, 102) YES, because tab_rowfilter_greater_10k has no filter
+# - INSERT (5500, 300) YES, because 5500 < 6000
+#
+# INSERT directly into tab_rowfilter_less_10k:
+# - INSERT (2, 200) YES, because 2 < 6000
+# - INSERT (6005, 201) NO, because 6005 is not < 6000
+#
+# INSERT directly into tab_rowfilter_greater_10k:
+# - INSERT (16000, 103) YES, because tab_rowfilter_greater_10k has no filter
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_less_10k ORDER BY 1, 2");
+is($result, qq(1|100
+2|200
+5500|300), 'check initial data copy from partition tab_rowfilter_less_10k');
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_greater_10k ORDER BY 1, 2");
+is($result, qq(15000|102
+16000|103), 'check initial data copy from partition tab_rowfilter_greater_10k');
+
+# The following commands are executed after CREATE SUBSCRIPTION, so these SQL
+# commands are for testing normal logical replication behavior.
+#
+# test row filter (INSERT, UPDATE, DELETE)
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (800, 'test 800')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1600, 'test 1600')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1601, 'test 1601')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_1 (a, b) VALUES (1700, 'test 1700')");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = NULL WHERE a = 1600");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_1 SET b = 'test 1601 updated' WHERE a = 1601");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_1 WHERE a = 1700");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_2 (c) VALUES (21), (22), (23), (24), (25)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_4 (c) VALUES (0), (11), (12)");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for tab_rowfilter_2
+# tap_pub_1 filter is: (c % 2 = 0)
+# tap_pub_2 filter is: (c % 3 = 0)
+# When there are multiple publications for the same table, the filter
+# expressions are OR'ed together. In this case, rows are replicated if
+# c is divisible by 2 OR 3.
+#
+# Expect original rows (2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20)
+# Plus (21, 22, 24)
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_2");
+is($result, qq(16|2|24), 'check replicated rows to tab_rowfilter_2');
+
+# Check expected replicated rows for tab_rowfilter_4
+# (same table in two publications but only one has a filter).
+# tap_pub_4a filter is: (c % 2 = 0)
+# tap_pub_4b filter is: <no filter>
+# Expressions are OR'ed together, but when there is no filter it just means
+# OR everything - i.e. the same as no filter at all.
+# Expect all rows from initial copy: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+# And also (0, 11, 12)
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(c), min(c), max(c) FROM tab_rowfilter_4");
+is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
+
+# Check expected replicated rows for tab_rowfilter_1
+# tap_pub_1 filter is: (a > 1000 AND b <> 'filtered')
+#
+# - 1001, 1002, 1980 already exist from initial data copy
+# - INSERT (800, 'test 800') NO, because 800 is not > 1000
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
+# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
+# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
+# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
+# - DELETE (1700) NO, row filter contains column b that is not part of
+# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
+# evaluates to false
+#
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
+is($result, qq(1001|test 1001
+1002|test 1002
+1600|test 1600
+1601|test 1601 updated
+1700|test 1700
+1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
+
+# Publish using root partitioned table
+# Use a different partitioned table layout (exercise publish_via_partition_root)
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_3 SET (publish_via_partition_root = true)");
+$node_subscriber->safe_psql('postgres',
+ "TRUNCATE TABLE tab_rowfilter_partitioned");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION WITH (copy_data = true)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_partitioned (a, b) VALUES(4000, 400),(4001, 401),(4002, 402)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(4500, 450)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_less_10k (a, b) VALUES(5600, 123)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(14000, 1950)");
+$node_publisher->safe_psql('postgres',
+ "UPDATE tab_rowfilter_less_10k SET b = 30 WHERE a = 4001");
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM tab_rowfilter_less_10k WHERE a = 4002");
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check expected replicated rows for partitions
+# The publication option publish_via_partition_root is true, so the row
+# filter of the root partitioned table is used.
+# tab_rowfilter_partitioned filter: (a < 5000)
+# tab_rowfilter_less_10k filter: (a < 6000)
+# tab_rowfilter_greater_10k filter: no filter
+#
+# After TRUNCATE, REFRESH PUBLICATION, the initial data copy will apply the
+# partitioned table row filter.
+# - INSERT (1, 100) YES, 1 < 5000
+# - INSERT (7000, 101) NO, 7000 is not < 5000
+# - INSERT (15000, 102) NO, 15000 is not < 5000
+# - INSERT (5500, 300) NO, 5500 is not < 5000
+# - INSERT (2, 200) YES, 2 < 5000
+# - INSERT (6005, 201) NO, 6005 is not < 5000
+# - INSERT (16000, 103) NO, 16000 is not < 5000
+#
+# Execute SQL commands after initial data copy for testing the logical
+# replication behavior.
+# - INSERT (4000, 400) YES, 4000 < 5000
+# - INSERT (4001, 401) YES, 4001 < 5000
+# - INSERT (4002, 402) YES, 4002 < 5000
+# - INSERT (4500, 450) YES, 4500 < 5000
+# - INSERT (5600, 123) NO, 5600 is not < 5000
+# - INSERT (14000, 1950) NO, 14000 is not < 5000
+# - UPDATE (4001) YES, 4001 < 5000
+# - DELETE (4002) YES, 4002 < 5000
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT a, b FROM tab_rowfilter_partitioned ORDER BY 1, 2");
+is( $result, qq(1|100
+2|200
+4000|400
+4001|30
+4500|450), 'check publish_via_partition_root behavior');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
--
1.8.3.1
v49-0003-Row-filter-updates-based-on-old-new-tuples.patch (application/octet-stream)
From 3db17a7a7386fd9a2f4ed33a9c5dacc68446e4d3 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 17 Dec 2021 20:39:04 +1100
Subject: [PATCH v49] Row-filter updates based on old/new tuples
When applying row filter on updates, check both old_tuple and new_tuple
to decide how an update needs to be transformed.
UPDATE
old-row (match) new-row (no match) -> DELETE
old-row (no match) new row (match) -> INSERT
old-row (match) new row (match) -> UPDATE
old-row (no match) new-row (no match) -> (drop change)
Tuples that have been deformed are also cached in slots to avoid
deforming the same tuple multiple times.
Author: Ajin Cherian
---
src/backend/replication/logical/proto.c | 38 +++--
src/backend/replication/pgoutput/pgoutput.c | 228 ++++++++++++++++++++++++----
src/include/replication/logicalproto.h | 7 +-
src/include/replication/reorderbuffer.h | 6 +-
src/test/subscription/t/027_row_filter.pl | 4 +-
src/tools/pgindent/typedefs.list | 1 +
6 files changed, 235 insertions(+), 49 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9f5bf4b..110ccff 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -19,6 +19,7 @@
#include "replication/logicalproto.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+#include "executor/executor.h"
/*
* Protocol message flags.
@@ -31,8 +32,8 @@
static void logicalrep_write_attrs(StringInfo out, Relation rel);
static void logicalrep_write_tuple(StringInfo out, Relation rel,
- HeapTuple tuple, bool binary);
-
+ HeapTuple tuple, TupleTableSlot *slot,
+ bool binary);
static void logicalrep_read_attrs(StringInfo in, LogicalRepRelation *rel);
static void logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple);
@@ -410,7 +411,7 @@ logicalrep_write_insert(StringInfo out, TransactionId xid, Relation rel,
pq_sendint32(out, RelationGetRelid(rel));
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, NULL, binary);
}
/*
@@ -442,7 +443,8 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
*/
void
logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
- HeapTuple oldtuple, HeapTuple newtuple, bool binary)
+ HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary)
{
pq_sendbyte(out, LOGICAL_REP_MSG_UPDATE);
@@ -463,11 +465,11 @@ logicalrep_write_update(StringInfo out, TransactionId xid, Relation rel,
pq_sendbyte(out, 'O'); /* old tuple follows */
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, oldslot, binary);
}
pq_sendbyte(out, 'N'); /* new tuple follows */
- logicalrep_write_tuple(out, rel, newtuple, binary);
+ logicalrep_write_tuple(out, rel, newtuple, newslot, binary);
}
/*
@@ -536,7 +538,7 @@ logicalrep_write_delete(StringInfo out, TransactionId xid, Relation rel,
else
pq_sendbyte(out, 'K'); /* old key follows */
- logicalrep_write_tuple(out, rel, oldtuple, binary);
+ logicalrep_write_tuple(out, rel, oldtuple, NULL, binary);
}
/*
@@ -749,13 +751,16 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
* Write a tuple to the outputstream, in the most efficient format possible.
*/
static void
-logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binary)
+logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, TupleTableSlot *slot,
+bool binary)
{
TupleDesc desc;
- Datum values[MaxTupleAttributeNumber];
- bool isnull[MaxTupleAttributeNumber];
+ Datum *values;
+ bool *isnull;
int i;
uint16 nliveatts = 0;
+ Datum attr_values[MaxTupleAttributeNumber];
+ bool attr_isnull[MaxTupleAttributeNumber];
desc = RelationGetDescr(rel);
@@ -771,7 +776,17 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binar
enlargeStringInfo(out, tuple->t_len +
nliveatts * (1 + 4));
- heap_deform_tuple(tuple, desc, values, isnull);
+ if (TupIsNull(slot))
+ {
+ values = attr_values;
+ isnull = attr_isnull;
+ heap_deform_tuple(tuple, desc, values, isnull);
+ }
+ else
+ {
+ values = slot->tts_values;
+ isnull = slot->tts_isnull;
+ }
/* Write the values */
for (i = 0; i < desc->natts; i++)
@@ -832,6 +847,7 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binar
ReleaseSysCache(typtup);
}
+
}
/*
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 2fa08e7..8a733cb 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,6 +13,7 @@
#include "postgres.h"
#include "access/tupconvert.h"
+#include "access/xact.h"
#include "catalog/partition.h"
#include "catalog/pg_publication.h"
#include "catalog/pg_publication_rel.h"
@@ -25,6 +26,7 @@
#include "parser/parse_coerce.h"
#include "replication/logical.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "replication/origin.h"
#include "replication/pgoutput.h"
#include "utils/builtins.h"
@@ -140,6 +142,9 @@ typedef struct RelationSyncEntry
ExprState *exprstate[IDX_PUBACTION_n]; /* ExprState array for row filter.
One per publication action. */
TupleTableSlot *scantuple; /* tuple table slot for row filter */
+ TupleTableSlot *new_tuple; /* slot for storing deformed new tuple during updates */
+ TupleTableSlot *old_tuple; /* slot for storing deformed old tuple during updates */
+ TupleTableSlot *tmp_new_tuple; /* slot for temporary new tuple used for expression evaluation */
/*
* OID of the relation to publish changes as. For a partition, this may
@@ -174,11 +179,15 @@ static bool get_schema_sent_in_streamed_txn(RelationSyncEntry *entry,
/* row filter routines */
static EState *create_estate_for_relation(Relation rel);
+static void pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry);
static ExprState *pgoutput_row_filter_init_expr(Node *rfnode);
static bool pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext);
-static bool pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
- Relation relation, HeapTuple oldtuple,
- HeapTuple newtuple, RelationSyncEntry *entry);
+static bool pgoutput_row_filter(enum ReorderBufferChangeType changetype, Relation relation,
+ HeapTuple oldtuple, HeapTuple newtuple,
+ TupleTableSlot *slot, RelationSyncEntry *entry);
+static bool pgoutput_row_filter_update_check(enum ReorderBufferChangeType changetype, Relation relation,
+ HeapTuple oldtuple, HeapTuple newtuple,
+ RelationSyncEntry *entry, ReorderBufferChangeType *action);
/*
* Specify output plugin callbacks
@@ -742,26 +751,124 @@ pgoutput_row_filter_exec_expr(ExprState *state, ExprContext *econtext)
}
/*
- * Change is checked against the row filter, if any.
- *
- * If it returns true, the change is replicated, otherwise, it is not.
+ * Update is checked against the row filter, if any.
+ * Updates are transformed to inserts and deletes based on the
+ * old_tuple and new_tuple. The new action is updated in the
+ * action parameter. If not updated, action remains as update.
+ * old-row (no match) new-row (no match) -> (drop change)
+ * old-row (no match) new row (match) -> INSERT
+ * old-row (match) new-row (no match) -> DELETE
+ * old-row (match) new row (match) -> UPDATE
+ * If the change is to be replicated returns true, else false.
*/
static bool
-pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
- Relation relation, HeapTuple oldtuple, HeapTuple newtuple,
- RelationSyncEntry *entry)
+pgoutput_row_filter_update_check(enum ReorderBufferChangeType changetype, Relation relation,
+ HeapTuple oldtuple, HeapTuple newtuple,
+ RelationSyncEntry *entry, ReorderBufferChangeType *action)
{
- EState *estate;
- ExprContext *ecxt;
- ListCell *lc;
- bool result = true;
- Oid relid = RelationGetRelid(relation);
- List *rfnodes[] = {NIL, NIL, NIL}; /* One per pubaction */
- bool no_filter[] = {false, false, false}; /* One per pubaction */
+ TupleDesc desc = RelationGetDescr(relation);
+ int i;
+ bool old_matched, new_matched;
+ TupleTableSlot *tmp_new_slot, *old_slot, *new_slot;
Assert(changetype == REORDER_BUFFER_CHANGE_INSERT ||
changetype == REORDER_BUFFER_CHANGE_UPDATE ||
changetype == REORDER_BUFFER_CHANGE_DELETE);
+
+ /* Bail out if there is no row filter */
+ if (!entry->exprstate[changetype])
+ return true;
+
+ /* update requires a new tuple */
+ Assert(newtuple);
+
+ elog(DEBUG3, "table \"%s.%s\" has row filter",
+ get_namespace_name(get_rel_namespace(RelationGetRelid(relation))),
+ get_rel_name(relation->rd_id));
+
+ /* Clear the tuples */
+ ExecClearTuple(entry->old_tuple);
+ ExecClearTuple(entry->new_tuple);
+
+ /*
+ * If no old_tuple, then none of the replica identity columns changed
+ * and this would reduce to a simple update.
+ */
+ if (!oldtuple)
+ {
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+ return pgoutput_row_filter(changetype, relation, NULL, newtuple, NULL, entry);
+ }
+
+ old_slot = entry->old_tuple;
+ new_slot = entry->new_tuple;
+ tmp_new_slot = entry->tmp_new_tuple;
+
+ ExecClearTuple(tmp_new_slot);
+
+ heap_deform_tuple(newtuple, desc, new_slot->tts_values, new_slot->tts_isnull);
+ heap_deform_tuple(oldtuple, desc, old_slot->tts_values, old_slot->tts_isnull);
+
+ ExecStoreVirtualTuple(old_slot);
+ ExecStoreVirtualTuple(new_slot);
+
+ tmp_new_slot = ExecCopySlot(tmp_new_slot, new_slot);
+
+ /*
+ * For updates, both the newtuple and oldtuple need to be checked
+ * against the row-filter. The newtuple might not have all the
+ * replica identity columns, in which case they need to be
+ * copied over from the oldtuple.
+ */
+ for (i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ /* if the column in the new_tuple is null, nothing to do */
+ if (tmp_new_slot->tts_isnull[i])
+ continue;
+
+ /*
+ * Unchanged toasted replica identity columns are only
+ * detoasted in the old tuple; copy them over to the new tuple.
+ */
+ if ((att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
+ (!old_slot->tts_isnull[i] &&
+ !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
+ {
+ Assert(!old_slot->tts_isnull[i] &&
+ !VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]));
+ tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
+ }
+
+ }
+
+ old_matched = pgoutput_row_filter(changetype, relation, NULL, NULL, old_slot, entry);
+ new_matched = pgoutput_row_filter(changetype, relation, NULL, NULL, tmp_new_slot, entry);
+
+ if (!old_matched && !new_matched)
+ return false;
+
+ if (!old_matched && new_matched)
+ *action = REORDER_BUFFER_CHANGE_INSERT;
+ else if (old_matched && !new_matched)
+ *action = REORDER_BUFFER_CHANGE_DELETE;
+ else if (new_matched && old_matched)
+ *action = REORDER_BUFFER_CHANGE_UPDATE;
+
+ return true;
+}
+
+/*
+ * Initialize the row filter, the first time.
+ */
+static void
+pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntry *entry)
+{
+ ListCell *lc;
+ List *rfnodes[] = {NIL, NIL, NIL}; /* One per pubaction */
+ bool no_filter[] = {false, false, false}; /* One per pubaction */
+
/*
* If the row filter caching is currently flagged "invalid" then it means we
* don't know yet if there is/isn't any row filters for this relation.
@@ -921,11 +1028,34 @@ pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
tupdesc = CreateTupleDescCopy(tupdesc);
entry->scantuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsHeapTuple);
+ entry->old_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+ entry->tmp_new_tuple = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
MemoryContextSwitchTo(oldctx);
}
entry->exprstate_valid = true;
}
+}
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ */
+static bool
+pgoutput_row_filter(enum ReorderBufferChangeType changetype, Relation relation,
+ HeapTuple oldtuple, HeapTuple newtuple, TupleTableSlot *slot,
+ RelationSyncEntry *entry)
+{
+ EState *estate;
+ ExprContext *ecxt;
+ bool result = true;
+ Oid relid = RelationGetRelid(relation);
+
+ Assert(changetype == REORDER_BUFFER_CHANGE_INSERT ||
+ changetype == REORDER_BUFFER_CHANGE_UPDATE ||
+ changetype == REORDER_BUFFER_CHANGE_DELETE);
/* Bail out if there is no row filter */
if (!entry->exprstate[changetype])
@@ -944,7 +1074,12 @@ pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
ecxt = GetPerTupleExprContext(estate);
ecxt->ecxt_scantuple = entry->scantuple;
- ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+ if (newtuple || oldtuple)
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple, ecxt->ecxt_scantuple, false);
+ else
+ {
+ ecxt->ecxt_scantuple = slot;
+ }
/*
* NOTE: Multiple publication row-filters have already been combined to a
@@ -957,7 +1092,6 @@ pgoutput_row_filter(enum ReorderBufferChangeType changetype, PGOutputData *data,
}
/* Cleanup allocated resources */
- ResetExprContext(ecxt);
FreeExecutorState(estate);
PopActiveSnapshot();
@@ -1015,6 +1149,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
/* Avoid leaking memory by using and resetting our own context */
old = MemoryContextSwitchTo(data->context);
+ /* Initialize the row_filter */
+ pgoutput_row_filter_init(data, relation, relentry);
+
/* Send the data */
switch (change->action)
{
@@ -1022,10 +1159,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple tuple = &change->data.tp.newtuple->tuple;
- /* Check row filter. */
- if (!pgoutput_row_filter(change->action, data, relation, NULL, tuple, relentry))
- break;
-
/*
* Schema should be sent before the logic that replaces the
* relation because it also sends the ancestor's relation.
@@ -1043,6 +1176,11 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
tuple = execute_attr_map_tuple(tuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(change->action, relation, NULL, tuple,
+ NULL, relentry))
+ break;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_insert(ctx->out, xid, relation, tuple,
data->binary);
@@ -1054,10 +1192,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
HeapTuple oldtuple = change->data.tp.oldtuple ?
&change->data.tp.oldtuple->tuple : NULL;
HeapTuple newtuple = &change->data.tp.newtuple->tuple;
-
- /* Check row filter. */
- if (!pgoutput_row_filter(change->action, data, relation, oldtuple, newtuple, relentry))
- break;
+ ReorderBufferChangeType modified_action = REORDER_BUFFER_CHANGE_UPDATE;
maybe_send_schema(ctx, change, relation, relentry);
@@ -1078,9 +1213,34 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
}
}
+ /* Check row filter */
+ if (!pgoutput_row_filter_update_check(change->action, relation,
+ oldtuple, newtuple, relentry,
+ &modified_action))
+ break;
+
OutputPluginPrepareWrite(ctx, true);
- logicalrep_write_update(ctx->out, xid, relation, oldtuple,
- newtuple, data->binary);
+
+ switch (modified_action)
+ {
+ case REORDER_BUFFER_CHANGE_INSERT:
+ logicalrep_write_insert(ctx->out, xid, relation, newtuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ logicalrep_write_update(ctx->out, xid, relation,
+ oldtuple, newtuple, relentry->old_tuple,
+ relentry->new_tuple,
+ data->binary);
+ break;
+ case REORDER_BUFFER_CHANGE_DELETE:
+ logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
+ data->binary);
+ break;
+ default:
+ Assert(false);
+ }
+
OutputPluginWrite(ctx, true);
break;
}
@@ -1089,10 +1249,6 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
{
HeapTuple oldtuple = &change->data.tp.oldtuple->tuple;
- /* Check row filter. */
- if (!pgoutput_row_filter(change->action, data, relation, oldtuple, NULL, relentry))
- break;
-
maybe_send_schema(ctx, change, relation, relentry);
/* Switch relation if publishing via root. */
@@ -1106,6 +1262,11 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
}
+ /* Check row filter. */
+ if (!pgoutput_row_filter(change->action, relation, oldtuple,
+ NULL, NULL, relentry))
+ break;
+
OutputPluginPrepareWrite(ctx, true);
logicalrep_write_delete(ctx->out, xid, relation, oldtuple,
data->binary);
@@ -1508,6 +1669,9 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
entry->pubactions.pubinsert = entry->pubactions.pubupdate =
entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
entry->scantuple = NULL;
+ entry->new_tuple = NULL;
+ entry->old_tuple = NULL;
+ entry->tmp_new_tuple = NULL;
entry->exprstate[REORDER_BUFFER_CHANGE_INSERT] = NULL;
entry->exprstate[REORDER_BUFFER_CHANGE_UPDATE] = NULL;
entry->exprstate[REORDER_BUFFER_CHANGE_DELETE] = NULL;
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 83741dc..427c40a 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -16,6 +16,7 @@
#include "access/xact.h"
#include "replication/reorderbuffer.h"
#include "utils/rel.h"
+#include "executor/executor.h"
/*
* Protocol capabilities
@@ -211,7 +212,11 @@ extern void logicalrep_write_insert(StringInfo out, TransactionId xid,
extern LogicalRepRelId logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup);
extern void logicalrep_write_update(StringInfo out, TransactionId xid,
Relation rel, HeapTuple oldtuple,
- HeapTuple newtuple, bool binary);
+ HeapTuple newtuple, TupleTableSlot *oldslot,
+ TupleTableSlot *newslot, bool binary);
+extern void logicalrep_write_update_cached(StringInfo out, TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);
extern LogicalRepRelId logicalrep_read_update(StringInfo in,
bool *has_oldtuple, LogicalRepTupleData *oldtup,
LogicalRepTupleData *newtup);
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 5b40ff7..aec0059 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
* respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
* logical decoding don't have to care about these.
*/
-enum ReorderBufferChangeType
+typedef enum ReorderBufferChangeType
{
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_UPDATE,
@@ -65,7 +65,7 @@ enum ReorderBufferChangeType
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
REORDER_BUFFER_CHANGE_TRUNCATE
-};
+} ReorderBufferChangeType;
/* forward declaration */
struct ReorderBufferTXN;
@@ -82,7 +82,7 @@ typedef struct ReorderBufferChange
XLogRecPtr lsn;
/* The type of change. */
- enum ReorderBufferChangeType action;
+ ReorderBufferChangeType action;
/* Transaction this change belongs to. */
struct ReorderBufferTXN *txn;
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index de6b73d..a2f25f6 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -277,7 +277,8 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
#
# - 1001, 1002, 1980 already exist from initial data copy
# - INSERT (800, 'test 800') NO, because 800 is not > 1000
-# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered'
+# - INSERT (1600, 'test 1600') YES, because 1600 > 1000 and 'test 1600' <> 'filtered',
+# but row deleted after the update below.
# - INSERT (1601, 'test 1601') YES, because 1601 > 1000 and 'test 1601' <> 'filtered'
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
@@ -289,7 +290,6 @@ $result =
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is($result, qq(1001|test 1001
1002|test 1002
-1600|test 1600
1601|test 1601 updated
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 89f3917..a9a1d0d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2200,6 +2200,7 @@ ReorderBufferApplyChangeCB
ReorderBufferApplyTruncateCB
ReorderBufferBeginCB
ReorderBufferChange
+ReorderBufferChangeType
ReorderBufferCommitCB
ReorderBufferCommitPreparedCB
ReorderBufferDiskChange
--
1.8.3.1
v49-0004-Row-filter-tab-auto-complete-and-pgdump.patchapplication/octet-stream; name=v49-0004-Row-filter-tab-auto-complete-and-pgdump.patchDownload
From 51f8f90208c3ad255af7844cd547c88c2014f43e Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 17 Dec 2021 20:40:15 +1100
Subject: [PATCH v49] Row-filter tab auto-complete and pgdump
tab-auto-complete
-----------------
e.g.
"CREATE PUBLICATION <name> FOR TABLE <name>" - complete with "WHERE (".
"ALTER PUBLICATION <name> ADD|SET TABLE <name>" - complete with "WHERE (".
Author: Peter Smith
pg_dump
-------
Author: Euler Taveira
---
src/bin/pg_dump/pg_dump.c | 24 ++++++++++++++++++++----
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/tab-complete.c | 24 ++++++++++++++++++++++--
3 files changed, 43 insertions(+), 6 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 784771c..4acae2a 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4034,6 +4034,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
int i_oid;
int i_prpubid;
int i_prrelid;
+ int i_prrelqual;
int i,
j,
ntups;
@@ -4044,9 +4045,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
query = createPQExpBuffer();
/* Collect all publication membership info. */
- appendPQExpBufferStr(query,
- "SELECT tableoid, oid, prpubid, prrelid "
- "FROM pg_catalog.pg_publication_rel");
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
+ else
+ appendPQExpBufferStr(query,
+ "SELECT tableoid, oid, prpubid, prrelid, "
+ "NULL AS prrelqual "
+ "FROM pg_catalog.pg_publication_rel");
res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
ntups = PQntuples(res);
@@ -4055,6 +4063,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
i_oid = PQfnumber(res, "oid");
i_prpubid = PQfnumber(res, "prpubid");
i_prrelid = PQfnumber(res, "prrelid");
+ i_prrelqual = PQfnumber(res, "prrelqual");
/* this allocation may be more than we need */
pubrinfo = pg_malloc(ntups * sizeof(PublicationRelInfo));
@@ -4095,6 +4104,10 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
pubrinfo[j].dobj.name = tbinfo->dobj.name;
pubrinfo[j].publication = pubinfo;
pubrinfo[j].pubtable = tbinfo;
+ if (PQgetisnull(res, i, i_prrelqual))
+ pubrinfo[j].pubrelqual = NULL;
+ else
+ pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
/* Decide whether we want to dump it */
selectDumpablePublicationObject(&(pubrinfo[j].dobj), fout);
@@ -4162,8 +4175,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo)
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
- appendPQExpBuffer(query, " %s;\n",
+ appendPQExpBuffer(query, " %s",
fmtQualifiedDumpable(tbinfo));
+ if (pubrinfo->pubrelqual)
+ appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
+ appendPQExpBufferStr(query, ";\n");
/*
* There is no point in creating a drop query as the drop is done by table
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index f011ace..0ebdce5 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -630,6 +630,7 @@ typedef struct _PublicationRelInfo
DumpableObject dobj;
PublicationInfo *publication;
TableInfo *pubtable;
+ char *pubrelqual;
} PublicationRelInfo;
/*
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index b524dc8..1d47634 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,6 +1654,19 @@ psql_completion(const char *text, int start, int end)
/* ALTER PUBLICATION <name> SET */
else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
+ /* ALTER PUBLICATION <name> SET TABLE <name> */
+ /* ALTER PUBLICATION <name> ADD TABLE <name> */
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (");
+ /*
+ * "ALTER PUBLICATION <name> SET TABLE <name> WHERE (" - complete with
+ * table attributes
+ *
+ * "ALTER PUBLICATION <name> ADD TABLE <name> WHERE (" - complete with
+ * table attributes
+ */
+ else if (HeadMatches("ALTER", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", "ALL", "TABLES", "IN", "SCHEMA"))
COMPLETE_WITH_QUERY(Query_for_list_of_schemas
" AND nspname != 'pg_catalog' "
@@ -2777,13 +2790,20 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("TABLES", "TABLES IN SCHEMA");
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES"))
COMPLETE_WITH("IN SCHEMA", "WITH (");
- else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny))
- COMPLETE_WITH("WITH (");
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
+ COMPLETE_WITH("WHERE (", "WITH (");
/* Complete "CREATE PUBLICATION <name> FOR TABLE" with "<table>, ..." */
else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
/*
+ * "CREATE PUBLICATION <name> FOR TABLE <name> WHERE (" - complete with
+ * table attributes
+ */
+ else if (HeadMatches("CREATE", "PUBLICATION", MatchAny) && TailMatches("WHERE", "("))
+ COMPLETE_WITH_ATTR(prev3_wd, "");
+
+ /*
* Complete "CREATE PUBLICATION <name> FOR ALL TABLES IN SCHEMA <schema>,
* ..."
*/
--
1.8.3.1
v49-0005-Row-filter-handle-FOR-ALL-TABLES.patchapplication/octet-stream; name=v49-0005-Row-filter-handle-FOR-ALL-TABLES.patchDownload
From 8bbe95240ccdba84d533ca54508ded8ef3d60824 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 17 Dec 2021 20:41:40 +1100
Subject: [PATCH v49] Row-filter handle FOR ALL TABLES
If one of the subscriber's publications was created using FOR ALL TABLES then
that implies NO row-filtering will be applied.
If one of the subscriber's publications was created using FOR ALL TABLES IN
SCHEMA and the table belongs to that same schema, then that also implies NO
row-filtering will be applied.
These rules override any row-filters from other subscribed publications.
Note that the initial COPY does not take publication operations into account.
Author: Peter Smith
Reported By: Tang
Discussion: https://www.postgresql.org/message-id/OS0PR01MB6113D82113AA081ACF710D0CFB6E9%40OS0PR01MB6113.jpnprd01.prod.outlook.com
---
src/backend/replication/logical/tablesync.c | 48 +++++++----
src/backend/replication/pgoutput/pgoutput.c | 63 ++++++++++++++-
src/test/subscription/t/027_row_filter.pl | 118 ++++++++++++++++++++++++++--
3 files changed, 202 insertions(+), 27 deletions(-)
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index c20c221..469aadc 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -802,21 +802,22 @@ fetch_remote_table_info(char *nspname, char *relname,
walrcv_clear_result(res);
/*
+ * If any publication has puballtables true then all row-filtering is
+ * ignored.
+ *
+ * If the relation is a member of a schema published by a subscribed
+ * publication using ALL TABLES IN SCHEMA, then all row-filtering is ignored.
+ *
* Get relation qual. DISTINCT avoids the same expression of a table in
* multiple publications from being included multiple times in the final
* expression.
*/
if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
{
- resetStringInfo(&cmd);
- appendStringInfo(&cmd,
- "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
- " FROM pg_publication p "
- " INNER JOIN pg_publication_rel pr "
- " ON (p.oid = pr.prpubid) "
- " WHERE pr.prrelid = %u "
- " AND p.pubname IN (", lrel->remoteid);
+ StringInfoData pub_names;
+ /* Build the pubname list. */
+ initStringInfo(&pub_names);
first = true;
foreach(lc, MySubscription->publications)
{
@@ -825,11 +826,28 @@ fetch_remote_table_info(char *nspname, char *relname,
if (first)
first = false;
else
- appendStringInfoString(&cmd, ", ");
+ appendStringInfoString(&pub_names, ", ");
- appendStringInfoString(&cmd, quote_literal_cstr(pubname));
+ appendStringInfoString(&pub_names, quote_literal_cstr(pubname));
}
- appendStringInfoChar(&cmd, ')');
+
+ /* Check for row-filters */
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN pg_publication_rel pr ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid = %u AND p.pubname IN ( %s ) "
+ " AND NOT (select bool_or(puballtables) "
+ " FROM pg_publication "
+ " WHERE pubname in ( %s )) "
+ " AND (SELECT count(1)=0 "
+ " FROM pg_publication_namespace pn, pg_class c "
+ " WHERE c.oid = %u AND c.relnamespace = pn.pnnspid)",
+ lrel->remoteid,
+ pub_names.data,
+ pub_names.data,
+ lrel->remoteid);
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
@@ -847,18 +865,14 @@ fetch_remote_table_info(char *nspname, char *relname,
slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
{
- Datum rf = slot_getattr(slot, 1, &isnull);
+ Datum rf = slot_getattr(slot, 1, &isnull);
if (!isnull)
*qual = lappend(*qual, makeString(TextDatumGetCString(rf)));
ExecClearTuple(slot);
- /*
- * One entry without a row filter expression means clean up
- * previous expressions (if there are any) and return with no
- * expressions.
- */
+ /* Ignore row filters and clean up as necessary. */
if (isnull)
{
if (*qual)
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 8a733cb..43d0125 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -912,13 +912,68 @@ pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntr
* relation. Since row filter usage depends on the DML operation,
* there are multiple lists (one for each operation) which row filters
* will be appended.
+ *
+ * NOTE: FOR ALL TABLES implies "use no filters" so it takes precedence
+ *
+ * NOTE: ALL TABLES IN SCHEMA also implies "use no filters" if the
+ * table is a member of the same schema.
*/
foreach(lc, data->publications)
{
- Publication *pub = lfirst(lc);
- HeapTuple rftuple;
- Datum rfdatum;
- bool rfisnull;
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple;
+ Datum rfdatum;
+ bool rfisnull;
+ List *schemarelids = NIL;
+
+ /*
+ * If the publication is FOR ALL TABLES then it is treated the same
+ * as if this table has no filters (even if for some other
+ * publication it does).
+ */
+ if (pub->alltables)
+ {
+ if (pub->pubactions.pubinsert)
+ no_filter[idx_ins] = true;
+ if (pub->pubactions.pubupdate)
+ no_filter[idx_upd] = true;
+ if (pub->pubactions.pubdelete)
+ no_filter[idx_del] = true;
+
+ /* Quick exit loop if all pubactions have no row-filter. */
+ if (no_filter[idx_ins] && no_filter[idx_upd] && no_filter[idx_del])
+ break;
+
+ continue;
+ }
+
+ /*
+ * If the publication is FOR ALL TABLES IN SCHEMA and it overlaps with the
+ * current relation in the same schema, then this is also treated the same as if
+ * this table has no filters (even if for some other publication it does).
+ */
+ schemarelids = GetAllSchemaPublicationRelations(pub->oid,
+ pub->pubviaroot ?
+ PUBLICATION_PART_ROOT :
+ PUBLICATION_PART_LEAF);
+ if (list_member_oid(schemarelids, entry->relid))
+ {
+ if (pub->pubactions.pubinsert)
+ no_filter[idx_ins] = true;
+ if (pub->pubactions.pubupdate)
+ no_filter[idx_upd] = true;
+ if (pub->pubactions.pubdelete)
+ no_filter[idx_del] = true;
+
+ list_free(schemarelids);
+
+ /* Quick exit loop if all pubactions have no row-filter. */
+ if (no_filter[idx_ins] && no_filter[idx_upd] && no_filter[idx_del])
+ break;
+
+ continue;
+ }
+ list_free(schemarelids);
/*
* Lookup if there is a row-filter, and if yes remember it in a list (per
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index a2f25f6..73add45 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
-use Test::More tests => 10;
+use Test::More tests => 14;
# create publisher node
my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
@@ -15,6 +15,116 @@ my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
$node_subscriber->init(allows_streaming => 'logical');
$node_subscriber->start;
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+
+# ====================================================================
+# Testcase start: FOR ALL TABLES
+#
+# The FOR ALL TABLES test must come first so that it is not affected by
+# all the other test tables that are later created.
+
+# create tables pub and sub
+$node_publisher->safe_psql('postgres', "CREATE TABLE tab_rf_x (x int primary key)");
+$node_subscriber->safe_psql('postgres', "CREATE TABLE tab_rf_x (x int primary key)");
+
+# insert some initial data
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rf_x (x) VALUES (0), (5), (10), (15), (20)");
+
+# create pub/sub
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_x FOR TABLE tab_rf_x WHERE (x > 10)");
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_forall FOR ALL TABLES");
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_x, tap_pub_forall");
+
+$node_publisher->wait_for_catchup($appname);
+# wait for initial table synchronization to finish
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# The subscription of the FOR ALL TABLES publication means there should be no
+# filtering on the tablesync COPY, so we expect all 5 rows to be present.
+my $result = $node_subscriber->safe_psql('postgres', "SELECT count(x) FROM tab_rf_x");
+is($result, qq(5), 'check initial data copy from table tab_rf_x should not be filtered');
+
+# Similarly, normal filtering after the initial phase will also have no effect.
+# Expected: 5 initial rows + 2 new rows = 7 rows
+$node_publisher->safe_psql('postgres', "INSERT INTO tab_rf_x (x) VALUES (-99), (99)");
+$node_publisher->wait_for_catchup($appname);
+$result = $node_subscriber->safe_psql('postgres', "SELECT count(x) FROM tab_rf_x");
+is($result, qq(7), 'check table tab_rf_x should not be filtered');
+
+# cleanup pub
+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_forall");
+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_x");
+$node_publisher->safe_psql('postgres', "DROP TABLE tab_rf_x");
+# cleanup sub
+$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub");
+$node_subscriber->safe_psql('postgres', "DROP TABLE tab_rf_x");
+
+# Testcase end: FOR ALL TABLES
+# ====================================================================
+
+# ====================================================================
+# Testcase start: ALL TABLES IN SCHEMA
+#
+# The ALL TABLES IN SCHEMA test is independent of all other test cases so it
+# cleans up after itself.
+
+# create tables pub and sub
+$node_publisher->safe_psql('postgres', "CREATE SCHEMA schema_rf_x");
+$node_publisher->safe_psql('postgres', "CREATE TABLE schema_rf_x.tab_rf_x (x int primary key)");
+$node_subscriber->safe_psql('postgres', "CREATE SCHEMA schema_rf_x");
+$node_subscriber->safe_psql('postgres', "CREATE TABLE schema_rf_x.tab_rf_x (x int primary key)");
+
+# insert some initial data
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO schema_rf_x.tab_rf_x (x) VALUES (0), (5), (10), (15), (20)");
+
+# create pub/sub
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_x FOR TABLE schema_rf_x.tab_rf_x WHERE (x > 10)");
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_allinschema FOR ALL TABLES IN SCHEMA schema_rf_x");
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_x, tap_pub_allinschema");
+
+$node_publisher->wait_for_catchup($appname);
+# wait for initial table synchronization to finish
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+# The subscription of the ALL TABLES IN SCHEMA publication means there should be
+# no filtering on the tablesync COPY, so we expect all 5 rows to be present.
+$result = $node_subscriber->safe_psql('postgres', "SELECT count(x) FROM schema_rf_x.tab_rf_x");
+is($result, qq(5), 'check initial data copy from table tab_rf_x should not be filtered');
+
+# Similarly, normal filtering after the initial phase will also have no effect.
+# Expected: 5 initial rows + 2 new rows = 7 rows
+$node_publisher->safe_psql('postgres', "INSERT INTO schema_rf_x.tab_rf_x (x) VALUES (-99), (99)");
+$node_publisher->wait_for_catchup($appname);
+$result = $node_subscriber->safe_psql('postgres', "SELECT count(x) FROM schema_rf_x.tab_rf_x");
+is($result, qq(7), 'check table tab_rf_x should not be filtered');
+
+# cleanup pub
+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_allinschema");
+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_x");
+$node_publisher->safe_psql('postgres', "DROP TABLE schema_rf_x.tab_rf_x");
+$node_publisher->safe_psql('postgres', "DROP SCHEMA schema_rf_x");
+# cleanup sub
+$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub");
+$node_subscriber->safe_psql('postgres', "DROP TABLE schema_rf_x.tab_rf_x");
+$node_subscriber->safe_psql('postgres', "DROP SCHEMA schema_rf_x");
+
+# Testcase end: ALL TABLES IN SCHEMA
+# ====================================================================
+
# setup structure on publisher
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
@@ -127,8 +237,6 @@ $node_publisher->safe_psql('postgres',
$node_publisher->safe_psql('postgres',
"INSERT INTO tab_rowfilter_greater_10k (a, b) VALUES(16000, 103)");
-my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
-my $appname = 'tap_sub';
$node_subscriber->safe_psql('postgres',
"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4a, tap_pub_4b"
);
@@ -136,8 +244,6 @@ $node_subscriber->safe_psql('postgres',
$node_publisher->wait_for_catchup($appname);
# wait for initial table synchronization to finish
-my $synced_query =
- "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
$node_subscriber->poll_query_until('postgres', $synced_query)
or die "Timed out while waiting for subscriber to synchronize data";
@@ -148,7 +254,7 @@ $node_subscriber->poll_query_until('postgres', $synced_query)
# - INSERT (1980, 'not filtered') YES
# - generate_series(990,1002) YES, only for 1001,1002 because a > 1000
#
-my $result =
+$result =
$node_subscriber->safe_psql('postgres',
"SELECT a, b FROM tab_rowfilter_1 ORDER BY 1, 2");
is( $result, qq(1001|test 1001
--
1.8.3.1
v49-0002-Row-filter-validation.patchapplication/octet-stream; name=v49-0002-Row-filter-validation.patchDownload
From 959c121173d1745009c4b13c8c5f5eee534a01b9 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Fri, 17 Dec 2021 20:37:32 +1100
Subject: [PATCH v49] Row-filter validation
This patch implements parse-tree "walkers" to validate a row-filter.
Expression Node-kind validation
-------------------------------
Only simple filter expressions are permitted. Specifically:
- no user-defined operators.
- no user-defined functions.
- no user-defined types.
- no system columns.
- no system functions (unless they are immutable). See design decision at [1].
[1] https://www.postgresql.org/message-id/CAA4eK1%2BXoD49bz5-2TtiD0ugq4PHSRX2D1sLPR_X4LNtdMc4OQ%40mail.gmail.com
Permits only simple nodes including:
List, Const, BoolExpr, NullIfExpr, NullTest, BooleanTest, CoalesceExpr,
CaseExpr, CaseTestExpr, MinMaxExpr, ArrayExpr, ScalarArrayOpExpr, XmlExpr.
Author: Peter Smith, Euler Taveira
REPLICA IDENTITY validation
---------------------------
For publish modes "delete" and "update", validate that all columns referenced
in the filter expression are part of the REPLICA IDENTITY or primary key.
Row filter column validation is done in CheckCmdReplicaIdentity, so that
the check is executed only when an actual UPDATE or DELETE is executed on
the published relation. This is consistent with the existing replica identity
check and can detect changes related to the row filter in time.
Cache the results of the row filter column validation in the relcache to
reduce the cost of the validation. It is safe to do this because every
operation that changes the row filter or the replica identity will invalidate
the relcache.
Author: Hou zj
---
src/backend/catalog/pg_publication.c | 177 +++++++++++++++++++-
src/backend/executor/execReplication.c | 36 ++++-
src/backend/parser/parse_agg.c | 10 --
src/backend/parser/parse_expr.c | 21 +--
src/backend/parser/parse_func.c | 3 -
src/backend/parser/parse_oper.c | 7 -
src/backend/parser/parse_relation.c | 9 --
src/backend/replication/pgoutput/pgoutput.c | 29 ++--
src/backend/utils/cache/relcache.c | 242 +++++++++++++++++++++++++---
src/include/catalog/pg_publication.h | 2 +-
src/include/parser/parse_node.h | 1 -
src/include/utils/rel.h | 7 +
src/include/utils/relcache.h | 1 +
src/test/regress/expected/publication.out | 231 +++++++++++++++++++++-----
src/test/regress/sql/publication.sql | 178 +++++++++++++++++---
src/test/subscription/t/027_row_filter.pl | 7 +-
src/tools/pgindent/typedefs.list | 1 +
17 files changed, 788 insertions(+), 174 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 0929aa0..17bc57b 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -33,9 +33,11 @@
#include "catalog/pg_publication_namespace.h"
#include "catalog/pg_publication_rel.h"
#include "catalog/pg_type.h"
+#include "catalog/pg_proc.h"
#include "commands/publicationcmds.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
#include "parser/parse_clause.h"
#include "parser/parse_collate.h"
#include "parser/parse_relation.h"
@@ -112,6 +114,137 @@ check_publication_add_schema(Oid schemaid)
}
/*
+ * Is this a simple Node permitted within a row filter expression?
+ */
+static bool
+IsRowFilterSimpleExpr(Node *node)
+{
+ switch (nodeTag(node))
+ {
+ case T_ArrayExpr:
+ case T_BooleanTest:
+ case T_BoolExpr:
+ case T_CaseExpr:
+ case T_CaseTestExpr:
+ case T_CoalesceExpr:
+ case T_Const:
+ case T_List:
+ case T_MinMaxExpr:
+ case T_NullIfExpr:
+ case T_NullTest:
+ case T_ScalarArrayOpExpr:
+ case T_XmlExpr:
+ return true;
+ default:
+ return false;
+ }
+}
+
+/*
+ * The row filter walker checks if the row filter expression is a "simple
+ * expression".
+ *
+ * It allows only simple or compound expressions such as:
+ * - "(Var Op Const)" or
+ * - "(Var Op Var)" or
+ * - "(Var Op Const) Bool (Var Op Const)"
+ * - etc
+ * (where Var is a column of the table this filter belongs to)
+ *
+ * Specifically,
+ * - User-defined operators are not allowed.
+ * - User-defined functions are not allowed.
+ * - User-defined types are not allowed.
+ * - Non-immutable builtin functions are not allowed.
+ * - System columns are not allowed.
+ *
+ * Notes:
+ *
+ * We don't allow user-defined functions/operators/types because (a) if the user
+ * drops such a user-definition or if there is any other error via its function,
+ * the walsender won't be able to recover from such an error even if we fix the
+ * function's problem because a historic snapshot is used to access the
+ * row-filter; (b) any other table could be accessed via a function, which won't
+ * work because of historic snapshots in logical decoding environment.
+ *
+ * We don't allow anything other than immutable built-in functions because
+ * non-immutable functions can access the database and would lead to the problem
+ * (b) mentioned in the previous paragraph.
+ */
+static bool
+rowfilter_walker(Node *node, Relation relation)
+{
+ char *errdetail_msg = NULL;
+
+ if (node == NULL)
+ return false;
+
+
+ if (IsRowFilterSimpleExpr(node))
+ {
+ /* OK, node is part of simple expressions */
+ }
+ else if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+
+ /* User-defined types not allowed. */
+ if (var->vartype >= FirstNormalObjectId)
+ errdetail_msg = _("User-defined types are not allowed");
+
+ /* System columns not allowed. */
+ else if (var->varattno < InvalidAttrNumber)
+ {
+ Oid relid = RelationGetRelid(relation);
+ const char *colname = get_attname(relid, var->varattno, false);
+
+ errdetail_msg = psprintf(_("Cannot use system column (%s)."), colname);
+ }
+ }
+ else if (IsA(node, OpExpr))
+ {
+ /* OK, except user-defined operators are not allowed. */
+ if (((OpExpr *) node)->opno >= FirstNormalObjectId)
+ errdetail_msg = _("User-defined operators are not allowed.");
+ }
+ else if (IsA(node, FuncExpr))
+ {
+ Oid funcid = ((FuncExpr *) node)->funcid;
+ const char *funcname = get_func_name(funcid);
+
+ /*
+ * User-defined functions are not allowed.
+ * System-functions that are not IMMUTABLE are not allowed.
+ */
+ if (funcid >= FirstNormalObjectId)
+ errdetail_msg = psprintf(_("User-defined functions are not allowed (%s)."),
+ funcname);
+ else if (func_volatile(funcid) != PROVOLATILE_IMMUTABLE)
+ errdetail_msg = psprintf(_("Non-immutable built-in functions are not allowed (%s)."),
+ funcname);
+ }
+ else
+ {
+ elog(DEBUG1, "the row filter contained something unexpected: %s", nodeToString(node));
+
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(relation)),
+ errdetail("Expressions only allow columns, constants and some built-in functions and operators.")
+ ));
+ }
+
+ if (errdetail_msg)
+ ereport(ERROR,
+ (errmsg("invalid publication WHERE expression for relation \"%s\"",
+ RelationGetRelationName(relation)),
+ errdetail("%s", errdetail_msg)
+ ));
+
+ return expression_tree_walker(node, rowfilter_walker, (void *)relation);
+}
+
+/*
* Returns if relation represented by oid and Form_pg_class entry
* is publishable.
*
@@ -241,10 +374,6 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(result);
}
-/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
- */
List *
GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
Oid relid)
@@ -298,7 +427,7 @@ GetTransformedWhereClause(ParseState *pstate, PublicationRelInfo *pri,
addNSItemToQuery(pstate, nsitem, false, true, true);
whereclause = transformWhereClause(pstate, copyObject(pri->whereClause),
- EXPR_KIND_PUBLICATION_WHERE,
+ EXPR_KIND_WHERE,
"PUBLICATION WHERE");
/* Fix up collation information */
@@ -309,6 +438,37 @@ GetTransformedWhereClause(ParseState *pstate, PublicationRelInfo *pri,
}
/*
+ * Check if any of the ancestors are published in the publication. If so,
+ * return the relid of the topmost ancestor that is published via this
+ * publication, otherwise InvalidOid.
+ */
+Oid
+GetTopMostAncestorInPublication(Oid puboid, List *ancestors)
+{
+ ListCell *lc;
+ Oid topmost_relid = InvalidOid;
+
+ /*
+ * Find the "topmost" ancestor that is in this
+ * publication.
+ */
+ foreach(lc, ancestors)
+ {
+ Oid ancestor = lfirst_oid(lc);
+
+ if (list_member_oid(GetRelationPublications(ancestor),
+ puboid) ||
+ list_member_oid(GetSchemaPublications(get_rel_namespace(ancestor)),
+ puboid))
+ {
+ topmost_relid = ancestor;
+ }
+ }
+
+ return topmost_relid;
+}
+
+/*
* Insert new publication / relation mapping.
*/
ObjectAddress
@@ -362,6 +522,13 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
* collation information.
*/
whereclause = GetTransformedWhereClause(pstate, pri, true);
+
+ /*
+ * Walk the parse-tree of this publication row filter expression and
+ * throw an error if anything not permitted or unexpected is
+ * encountered.
+ */
+ rowfilter_walker(whereclause, targetrel);
}
/* Form a tuple. */
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 574d7d2..42c5dbe 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -568,14 +568,46 @@ void
CheckCmdReplicaIdentity(Relation rel, CmdType cmd)
{
PublicationActions *pubactions;
+ AttrNumber bad_rfcolnum;
/* We only need to do checks for UPDATE and DELETE. */
if (cmd != CMD_UPDATE && cmd != CMD_DELETE)
return;
+ if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ return;
+
+ bad_rfcolnum = GetRelationPublicationInfo(rel, true);
+
+ /*
+ * It is only safe to execute UPDATE/DELETE when all columns referenced in
+ * the row filters from publications which the relation is in are valid,
+ * which means all referenced columns are part of REPLICA IDENTITY, or the
+ * table do not publish UPDATES or DELETES.
+ */
+ if (AttributeNumberIsValid(bad_rfcolnum))
+ {
+ const char *colname = get_attname(RelationGetRelid(rel),
+ bad_rfcolnum, false);
+
+ if (cmd == CMD_UPDATE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot update table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Column \"%s\" used in the publication WHERE expression is not part of the replica identity.",
+ colname)));
+ else if (cmd == CMD_DELETE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot delete from table \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail("Column \"%s\" used in the publication WHERE expression is not part of the replica identity.",
+ colname)));
+ }
+
/* If relation has replica identity we are always good. */
- if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
- OidIsValid(RelationGetReplicaIndex(rel)))
+ if (OidIsValid(RelationGetReplicaIndex(rel)))
return;
/*
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 193c87d..7d829a0 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -551,13 +551,6 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
err = _("grouping operations are not allowed in COPY FROM WHERE conditions");
break;
- case EXPR_KIND_PUBLICATION_WHERE:
- if (isAgg)
- err = _("aggregate functions are not allowed in publication WHERE expressions");
- else
- err = _("grouping operations are not allowed in publication WHERE expressions");
-
- break;
case EXPR_KIND_CYCLE_MARK:
errkind = true;
@@ -950,9 +943,6 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
- case EXPR_KIND_PUBLICATION_WHERE:
- err = _("window functions are not allowed in publication WHERE expressions");
- break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d43839..2d1a477 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -200,19 +200,8 @@ transformExprRecurse(ParseState *pstate, Node *expr)
break;
case T_FuncCall:
- {
- /*
- * Forbid functions in publication WHERE condition
- */
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("functions are not allowed in publication WHERE expressions"),
- parser_errposition(pstate, exprLocation(expr))));
-
- result = transformFuncCall(pstate, (FuncCall *) expr);
- break;
- }
+ result = transformFuncCall(pstate, (FuncCall *) expr);
+ break;
case T_MultiAssignRef:
result = transformMultiAssignRef(pstate, (MultiAssignRef *) expr);
@@ -515,7 +504,6 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_COPY_WHERE:
case EXPR_KIND_GENERATED_COLUMN:
case EXPR_KIND_CYCLE_MARK:
- case EXPR_KIND_PUBLICATION_WHERE:
/* okay */
break;
@@ -1776,9 +1764,6 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_GENERATED_COLUMN:
err = _("cannot use subquery in column generation expression");
break;
- case EXPR_KIND_PUBLICATION_WHERE:
- err = _("cannot use subquery in publication WHERE expression");
- break;
/*
* There is intentionally no default: case here, so that the
@@ -3099,8 +3084,6 @@ ParseExprKindName(ParseExprKind exprKind)
return "GENERATED AS";
case EXPR_KIND_CYCLE_MARK:
return "CYCLE";
- case EXPR_KIND_PUBLICATION_WHERE:
- return "publication expression";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 29bebb7..542f916 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2655,9 +2655,6 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CYCLE_MARK:
errkind = true;
break;
- case EXPR_KIND_PUBLICATION_WHERE:
- err = _("set-returning functions are not allowed in publication WHERE expressions");
- break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index 29f8835..bc34a23 100644
--- a/src/backend/parser/parse_oper.c
+++ b/src/backend/parser/parse_oper.c
@@ -718,13 +718,6 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
opform->oprright)),
parser_errposition(pstate, location)));
- /* Check it's not a custom operator for publication WHERE expressions */
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE && opform->oid >= FirstNormalObjectId)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("user-defined operators are not allowed in publication WHERE expressions"),
- parser_errposition(pstate, location)));
-
/* Do typecasting and build the expression tree */
if (ltree == NULL)
{
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 036d9c6..c5c3f26 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -3538,20 +3538,11 @@ errorMissingRTE(ParseState *pstate, RangeVar *relation)
rte->eref->aliasname)),
parser_errposition(pstate, relation->location)));
else
- {
- if (pstate->p_expr_kind == EXPR_KIND_PUBLICATION_WHERE)
- ereport(ERROR,
- (errcode(ERRCODE_UNDEFINED_TABLE),
- errmsg("publication WHERE expression invalid reference to table \"%s\"",
- relation->relname),
- parser_errposition(pstate, relation->location)));
-
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_TABLE),
errmsg("missing FROM-clause entry for table \"%s\"",
relation->relname),
parser_errposition(pstate, relation->location)));
- }
}
/*
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 2fa08e7..f42e7af 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -1572,26 +1572,17 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
*/
if (am_partition)
{
- List *ancestors = get_partition_ancestors(relid);
- ListCell *lc2;
-
- /*
- * Find the "topmost" ancestor that is in this
- * publication.
- */
- foreach(lc2, ancestors)
+ Oid ancestor;
+ List *ancestors = get_partition_ancestors(relid);
+
+ ancestor = GetTopMostAncestorInPublication(pub->oid,
+ ancestors);
+
+ if (ancestor != InvalidOid)
{
- Oid ancestor = lfirst_oid(lc2);
-
- if (list_member_oid(GetRelationPublications(ancestor),
- pub->oid) ||
- list_member_oid(GetSchemaPublications(get_rel_namespace(ancestor)),
- pub->oid))
- {
- ancestor_published = true;
- if (pub->pubviaroot)
- publish_as_relid = ancestor;
- }
+ ancestor_published = true;
+ if (pub->pubviaroot)
+ publish_as_relid = ancestor;
}
}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 105d8d4..0dd871a 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -56,6 +56,7 @@
#include "catalog/pg_opclass.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_publication.h"
+#include "catalog/pg_publication_rel.h"
#include "catalog/pg_rewrite.h"
#include "catalog/pg_shseclabel.h"
#include "catalog/pg_statistic_ext.h"
@@ -71,6 +72,8 @@
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/optimizer.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_relation.h"
#include "rewrite/rewriteDefine.h"
#include "rewrite/rowsecurity.h"
#include "storage/lmgr.h"
@@ -84,6 +87,7 @@
#include "utils/memutils.h"
#include "utils/relmapper.h"
#include "utils/resowner_private.h"
+#include "utils/ruleutils.h"
#include "utils/snapmgr.h"
#include "utils/syscache.h"
@@ -267,6 +271,19 @@ typedef struct opclasscacheent
static HTAB *OpClassCache = NULL;
+/*
+ * Information used to validate the columns in the row filter expression.
+ * See rowfilter_column_walker for details.
+ */
+typedef struct rf_context
+{
+ AttrNumber invalid_rfcolnum; /* invalid column number */
+ Bitmapset *bms_replident; /* bitset of replica identity col indexes */
+ bool pubviaroot; /* true if we are validating the parent
+ * relation's row filter */
+ Oid relid; /* relid of the relation */
+ Oid parentid; /* relid of the parent relation */
+} rf_context;
/* non-export function prototypes */
@@ -5521,39 +5538,91 @@ RelationGetExclusionInfo(Relation indexRelation,
MemoryContextSwitchTo(oldcxt);
}
+
+
/*
- * Get publication actions for the given relation.
+ * Check if any columns used in the row-filter WHERE clause are not part of
+ * REPLICA IDENTITY and save the invalid column number in
+ * rf_context::invalid_rfcolnum.
*/
-struct PublicationActions *
-GetRelationPublicationActions(Relation relation)
+static bool
+rowfilter_column_walker(Node *node, rf_context *context)
{
- List *puboids;
- ListCell *lc;
- MemoryContext oldcxt;
- Oid schemaid;
- PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+ AttrNumber attnum = var->varattno;
+
+ /*
+ * If pubviaroot is true, we are validating the parent table's row
+ * filter, but the replica identity column bitmap we have describes
+ * the child table. Convert the column number in the row filter
+ * expression to the child table's column number, since the parent's
+ * column order may differ from the child's.
+ */
+ if (context->pubviaroot)
+ {
+ char *colname = get_attname(context->parentid, attnum, false);
+ attnum = get_attnum(context->relid, colname);
+ }
+
+ if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber,
+ context->bms_replident))
+ {
+ context->invalid_rfcolnum = attnum;
+ return true;
+ }
+ }
+
+ return expression_tree_walker(node, rowfilter_column_walker,
+ (void *) context);
+}
+
+/*
+ * Get the publication information for the given relation.
+ *
+ * Traverse all the publications the relation is in to get the publication
+ * actions. If the publication actions include UPDATE or DELETE and
+ * validate_rowfilter is true, verify that all columns referenced in the
+ * row filter expression are part of the REPLICA IDENTITY.
+ *
+ * If not all the row filter columns are part of the REPLICA IDENTITY,
+ * return the invalid column number; otherwise, return InvalidAttrNumber.
+ */
+AttrNumber
+GetRelationPublicationInfo(Relation relation, bool validate_rowfilter)
+{
+ List *puboids;
+ ListCell *lc;
+ MemoryContext oldcxt;
+ Oid schemaid;
+ List *ancestors = NIL;
+ Oid relid = RelationGetRelid(relation);
+ rf_context context = { 0 };
+ PublicationActions pubactions = { 0 };
+ bool rfcol_valid = true;
+ AttrNumber invalid_rfcolnum = InvalidAttrNumber;
/*
* If not publishable, it publishes no actions. (pgoutput_change() will
* ignore it.)
*/
- if (!is_publishable_relation(relation))
- return pubactions;
-
- if (relation->rd_pubactions)
- return memcpy(pubactions, relation->rd_pubactions,
- sizeof(PublicationActions));
+ if (!is_publishable_relation(relation) || relation->rd_rfcol_valid)
+ return invalid_rfcolnum;
/* Fetch the publication membership info. */
- puboids = GetRelationPublications(RelationGetRelid(relation));
+ puboids = GetRelationPublications(relid);
schemaid = RelationGetNamespace(relation);
puboids = list_concat_unique_oid(puboids, GetSchemaPublications(schemaid));
if (relation->rd_rel->relispartition)
{
/* Add publications that the ancestors are in too. */
- List *ancestors = get_partition_ancestors(RelationGetRelid(relation));
- ListCell *lc;
+ ancestors = get_partition_ancestors(relid);
foreach(lc, ancestors)
{
@@ -5568,6 +5637,20 @@ GetRelationPublicationActions(Relation relation)
}
puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
+ /*
+ * Determine which columns are part of the REPLICA IDENTITY.
+ * Note that REPLICA IDENTITY DEFAULT means primary key or nothing.
+ */
+ if (validate_rowfilter)
+ {
+ if (relation->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT)
+ context.bms_replident = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_PRIMARY_KEY);
+ else if (relation->rd_rel->relreplident == REPLICA_IDENTITY_INDEX)
+ context.bms_replident = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_IDENTITY_KEY);
+ }
+
foreach(lc, puboids)
{
Oid pubid = lfirst_oid(lc);
@@ -5581,34 +5664,140 @@ GetRelationPublicationActions(Relation relation)
pubform = (Form_pg_publication) GETSTRUCT(tup);
- pubactions->pubinsert |= pubform->pubinsert;
- pubactions->pubupdate |= pubform->pubupdate;
- pubactions->pubdelete |= pubform->pubdelete;
- pubactions->pubtruncate |= pubform->pubtruncate;
+ pubactions.pubinsert |= pubform->pubinsert;
+ pubactions.pubupdate |= pubform->pubupdate;
+ pubactions.pubdelete |= pubform->pubdelete;
+ pubactions.pubtruncate |= pubform->pubtruncate;
ReleaseSysCache(tup);
/*
- * If we know everything is replicated, there is no point to check for
- * other publications.
+ * If the publication actions include UPDATE or DELETE and the
+ * validate_rowfilter flag is true, validate that all columns
+ * referenced in the filter expression are part of the REPLICA
+ * IDENTITY index.
+ *
+ * FULL means all columns are in the REPLICA IDENTITY, so any column
+ * is allowed in the row filter and we can skip the validation.
+ *
+ * If we have already found a row filter column that is not part of
+ * the REPLICA IDENTITY index, skip the validation as well.
+ if (validate_rowfilter &&
+ (pubform->pubupdate || pubform->pubdelete) &&
+ relation->rd_rel->relreplident != REPLICA_IDENTITY_FULL &&
+ rfcol_valid)
+ {
+ HeapTuple rftuple;
+ Oid publish_as_relid = InvalidOid;
+
+ /*
+ * For a partition, if pubviaroot is true, check whether any of its
+ * ancestors are published. If so, note the topmost ancestor that is
+ * published via this publication; that ancestor's row filter
+ * expression will be used to filter the partition's changes. We
+ * could have obtained the topmost ancestor while collecting the
+ * publication OIDs, but that would make the code more complicated.
+ */
+ if (pubform->pubviaroot && relation->rd_rel->relispartition)
+ {
+ if (pubform->puballtables)
+ publish_as_relid = llast_oid(ancestors);
+ else
+ publish_as_relid = GetTopMostAncestorInPublication(pubform->oid,
+ ancestors);
+ }
+
+ if (publish_as_relid == InvalidOid)
+ publish_as_relid = relid;
+
+ rftuple = SearchSysCache2(PUBLICATIONRELMAP,
+ ObjectIdGetDatum(publish_as_relid),
+ ObjectIdGetDatum(pubid));
+
+ if (HeapTupleIsValid(rftuple))
+ {
+ Datum rfdatum;
+ bool rfisnull;
+ Node *rfnode;
+
+ context.pubviaroot = pubform->pubviaroot;
+ context.parentid = publish_as_relid;
+ context.relid = relid;
+
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
+ Anum_pg_publication_rel_prqual,
+ &rfisnull);
+
+ if (!rfisnull)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfcol_valid = !rowfilter_column_walker(rfnode, &context);
+ invalid_rfcolnum = context.invalid_rfcolnum;
+ pfree(rfnode);
+ }
+
+ ReleaseSysCache(rftuple);
+ }
+ }
+
+ /*
+ * If we know everything is replicated and row filter validation is not
+ * needed or has already failed, there is no point in checking the
+ * remaining publications.
*/
- if (pubactions->pubinsert && pubactions->pubupdate &&
- pubactions->pubdelete && pubactions->pubtruncate)
+ if (pubactions.pubinsert && pubactions.pubupdate &&
+ pubactions.pubdelete && pubactions.pubtruncate &&
+ (!validate_rowfilter || !rfcol_valid))
break;
}
+ bms_free(context.bms_replident);
+
if (relation->rd_pubactions)
{
pfree(relation->rd_pubactions);
relation->rd_pubactions = NULL;
}
+ if (validate_rowfilter)
+ relation->rd_rfcol_valid = rfcol_valid;
+
/* Now save copy of the actions in the relcache entry. */
oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
relation->rd_pubactions = palloc(sizeof(PublicationActions));
- memcpy(relation->rd_pubactions, pubactions, sizeof(PublicationActions));
+ memcpy(relation->rd_pubactions, &pubactions, sizeof(PublicationActions));
MemoryContextSwitchTo(oldcxt);
+ return invalid_rfcolnum;
+}
+
+/*
+ * Get publication actions for the given relation.
+ */
+struct PublicationActions *
+GetRelationPublicationActions(Relation relation)
+{
+ PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
+
+ /*
+ * If not publishable, it publishes no actions. (pgoutput_change() will
+ * ignore it.)
+ */
+ if (!is_publishable_relation(relation))
+ return pubactions;
+
+ if (relation->rd_pubactions)
+ {
+ memcpy(pubactions, relation->rd_pubactions,
+ sizeof(PublicationActions));
+ return pubactions;
+ }
+
+ (void) GetRelationPublicationInfo(relation, false);
+ memcpy(pubactions, relation->rd_pubactions,
+ sizeof(PublicationActions));
+
return pubactions;
}
@@ -6163,6 +6352,7 @@ load_relcache_init_file(bool shared)
rel->rd_idattr = NULL;
rel->rd_hotblockingattr = NULL;
rel->rd_pubactions = NULL;
+ rel->rd_rfcol_valid = false;
rel->rd_statvalid = false;
rel->rd_statlist = NIL;
rel->rd_fkeyvalid = false;
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 96c55f6..9e197de 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -135,6 +135,6 @@ extern char *get_publication_name(Oid pubid, bool missing_ok);
extern Node *GetTransformedWhereClause(ParseState *pstate,
PublicationRelInfo *pri,
bool bfixupcollation);
-
+extern Oid GetTopMostAncestorInPublication(Oid puboid, List *ancestors);
#endif /* PG_PUBLICATION_H */
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index d58ae6a..ee17908 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -80,7 +80,6 @@ typedef enum ParseExprKind
EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
EXPR_KIND_CYCLE_MARK, /* cycle mark value */
- EXPR_KIND_PUBLICATION_WHERE /* WHERE condition for a table in PUBLICATION */
} ParseExprKind;
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 3128127..27cec81 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -164,6 +164,13 @@ typedef struct RelationData
PublicationActions *rd_pubactions; /* publication actions */
/*
+ * true if the columns referenced in the row filters of all publications
+ * the relation is in are part of the replica identity, or if the
+ * publication actions do not include UPDATE or DELETE.
+ */
+ bool rd_rfcol_valid;
+
+ /*
* rd_options is set whenever rd_rel is loaded into the relcache entry.
* Note that you can NOT look into rd_rel for this data. NULL means "use
* defaults".
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 82316bb..9cc4a38 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -76,6 +76,7 @@ extern void RelationInitIndexAccessInfo(Relation relation);
/* caller must include pg_publication.h */
struct PublicationActions;
extern struct PublicationActions *GetRelationPublicationActions(Relation relation);
+extern AttrNumber GetRelationPublicationInfo(Relation relation, bool validate_rowfilter);
extern void RelationInitTableAccessMethod(Relation relation);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 5a49003..d5bb70b 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -243,18 +243,21 @@ CREATE TABLE testpub_rf_tbl1 (a integer, b text);
CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
-CREATE SCHEMA testpub_rf_myschema;
-CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
-CREATE SCHEMA testpub_rf_myschema1;
-CREATE TABLE testpub_rf_myschema1.testpub_rf_tbl6(i integer);
+CREATE TABLE testpub_rf_tbl5 (a xml);
+CREATE SCHEMA testpub_rf_schema1;
+CREATE TABLE testpub_rf_schema1.testpub_rf_tbl5 (h integer);
+CREATE SCHEMA testpub_rf_schema2;
+CREATE TABLE testpub_rf_schema2.testpub_rf_tbl6 (i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish='insert' because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub5
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
@@ -264,7 +267,7 @@ ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 200
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl2" WHERE ((c <> 'test'::text) AND (d < 5))
@@ -275,7 +278,7 @@ ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE ((e > 1000) AND (e < 2000))
@@ -286,7 +289,7 @@ ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500)
Publication testpub5
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl3" WHERE ((e > 300) AND (e < 500))
@@ -308,43 +311,43 @@ Publications:
DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax1
Publication testpub_syntax1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
"public.testpub_rf_tbl3" WHERE (e < 999)
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_schema1.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax2
Publication testpub_syntax2
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
--------------------------+------------+---------+---------+---------+-----------+----------
- regress_publication_user | f | t | t | t | t | f
+ regress_publication_user | f | t | f | f | f | f
Tables:
"public.testpub_rf_tbl1"
- "testpub_rf_myschema.testpub_rf_tbl5" WHERE (h < 999)
+ "testpub_rf_schema1.testpub_rf_tbl5" WHERE (h < 999)
DROP PUBLICATION testpub_syntax2;
--- fail - schemas are not allowed WHERE row-filter
+-- fail - schemas don't allow WHERE clause
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1 WHERE (a = 123);
ERROR: syntax error at or near "WHERE"
-LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a =...
+LINE 1: ...ntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1 WHERE (a =...
^
-CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1, testpub_rf_schema1 WHERE (a = 123);
ERROR: WHERE clause for schema not allowed
-LINE 1: ...ax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf...
+LINE 1: ...tax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1, testpub_rf...
^
RESET client_min_messages;
--- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+-- fail - duplicate tables are not allowed if that table has any WHERE clause
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
@@ -353,43 +356,185 @@ ERROR: conflicting or redundant row-filters for "testpub_rf_tbl1"
RESET client_min_messages;
-- fail - aggregate functions not allowed in WHERE clause
ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e < AVG(e));
-ERROR: functions are not allowed in publication WHERE expressions
+ERROR: aggregate functions are not allowed in WHERE
LINE 1: ...TION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e < AVG(e));
^
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
-ERROR: functions are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) ...
- ^
--- fail - user-defined operators disallowed
-CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
-CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+-- fail - user-defined operators are not allowed
+CREATE FUNCTION testpub_rf_func1(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func1, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
-ERROR: user-defined operators are not allowed in publication WHERE expressions
-LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
- ^
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: User-defined operators are not allowed.
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func2() RETURNS integer AS $$ BEGIN RETURN 123; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a >= testpub_rf_func2());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: User-defined functions are not allowed (testpub_rf_func2).
+-- fail - non-immutable functions are not allowed. random() is volatile.
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a < random());
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: Non-immutable built-in functions are not allowed (random).
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
+-- ok - builtin operators are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE ((a > 5) IS FALSE);
+-- ok - immutable builtin functions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined types disallowed
+CREATE TYPE rf_bug_status AS ENUM ('new', 'open', 'closed');
+CREATE TABLE rf_bug (id serial, description text, status rf_bug_status);
+CREATE PUBLICATION testpub6 FOR TABLE rf_bug WHERE (status = 'open') WITH (publish = 'insert');
+ERROR: invalid publication WHERE expression for relation "rf_bug"
+DETAIL: User-defined types are not allowed
+DROP TABLE rf_bug;
+DROP TYPE rf_bug_status;
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: Expressions only allow columns, constants and some built-in functions and operators.
+-- fail - system columns are not allowed
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl1 WHERE ('(0,1)'::tid = ctid);
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl1"
+DETAIL: Cannot use system column (ctid).
+-- ok - conditional expressions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (a IS DOCUMENT);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (xmlexists('//foo[text() = ''bar'']' PASSING BY VALUE a));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1, 2) = a);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (CASE a WHEN 5 THEN true ELSE false END);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (COALESCE(b, 'foo') = 'foo');
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (GREATEST(a, 10) > 10);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IN (2, 4, 6));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (ARRAY[a] <@ ARRAY[2, 4, 6]);
-- fail - WHERE not allowed in DROP
-ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl1 WHERE (e < 27);
ERROR: invalid use of WHERE row-filter in ALTER PUBLICATION ... DROP TABLE
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
-ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tbl6 WHERE (i < 99);
-ERROR: cannot add relation "testpub_rf_myschema1.testpub_rf_tbl6" to publication
-DETAIL: Table's schema "testpub_rf_myschema1" is already part of the publication or part of the specified schema list.
+CREATE PUBLICATION testpub6 FOR ALL TABLES IN SCHEMA testpub_rf_schema2;
+ALTER PUBLICATION testpub6 SET ALL TABLES IN SCHEMA testpub_rf_schema2, TABLE testpub_rf_schema2.testpub_rf_tbl6 WHERE (i < 99);
+ERROR: cannot add relation "testpub_rf_schema2.testpub_rf_tbl6" to publication
+DETAIL: Table's schema "testpub_rf_schema2" is already part of the publication or part of the specified schema list.
RESET client_min_messages;
DROP TABLE testpub_rf_tbl1;
DROP TABLE testpub_rf_tbl2;
DROP TABLE testpub_rf_tbl3;
DROP TABLE testpub_rf_tbl4;
-DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
-DROP TABLE testpub_rf_myschema1.testpub_rf_tbl6;
-DROP SCHEMA testpub_rf_myschema;
-DROP SCHEMA testpub_rf_myschema1;
+DROP TABLE testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema1.testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema2.testpub_rf_tbl6;
+DROP SCHEMA testpub_rf_schema1;
+DROP SCHEMA testpub_rf_schema2;
DROP PUBLICATION testpub5;
-DROP PUBLICATION testpub7;
+DROP PUBLICATION testpub6;
DROP OPERATOR =#>(integer, integer);
-DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func1(integer, integer);
+DROP FUNCTION testpub_rf_func2();
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+CREATE TABLE rf_tbl_abcd_part_pk (a int PRIMARY KEY, b int) PARTITION by RANGE (a);
+CREATE TABLE rf_tbl_abcd_part_pk_1 (b int, a int PRIMARY KEY);
+ALTER TABLE rf_tbl_abcd_part_pk ATTACH PARTITION rf_tbl_abcd_part_pk_1 FOR VALUES FROM (1) TO (10);
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+-- ok - "a" is a PK col
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (b > 99);
+-- ok - "b" is a PK col
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "c" is not part of the PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Column "c" used in the publication WHERE expression is not part of the replica identity.
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- fail - "d" is not part of the PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Column "d" used in the publication WHERE expression is not part of the replica identity.
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not part of REPLICA IDENTITY
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
+DETAIL: Column "a" used in the publication WHERE expression is not part of the replica identity.
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "a" is in REPLICA IDENTITY now
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Column "a" used in the publication WHERE expression is not part of the replica identity.
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Column "c" used in the publication WHERE expression is not part of the replica identity.
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
+DETAIL: Column "a" used in the publication WHERE expression is not part of the replica identity.
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_pk"
+DETAIL: Column "a" used in the publication WHERE expression is not part of the replica identity.
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_nopk"
+DETAIL: Column "a" used in the publication WHERE expression is not part of the replica identity.
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+-- Tests for partitioned table
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (a > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- ok - "a" is a OK col
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (b > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- fail - "b" is not in REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ERROR: cannot update table "rf_tbl_abcd_part_pk_1"
+DETAIL: Column "b" used in the publication WHERE expression is not part of the replica identity.
+DROP PUBLICATION testpub6;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+DROP TABLE rf_tbl_abcd_part_pk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
CREATE TABLE testpub_tbl4(a int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 47bdba8..a95c71b 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -138,12 +138,15 @@ CREATE TABLE testpub_rf_tbl1 (a integer, b text);
CREATE TABLE testpub_rf_tbl2 (c text, d integer);
CREATE TABLE testpub_rf_tbl3 (e integer);
CREATE TABLE testpub_rf_tbl4 (g text);
-CREATE SCHEMA testpub_rf_myschema;
-CREATE TABLE testpub_rf_myschema.testpub_rf_tbl5(h integer);
-CREATE SCHEMA testpub_rf_myschema1;
-CREATE TABLE testpub_rf_myschema1.testpub_rf_tbl6(i integer);
+CREATE TABLE testpub_rf_tbl5 (a xml);
+CREATE SCHEMA testpub_rf_schema1;
+CREATE TABLE testpub_rf_schema1.testpub_rf_tbl5 (h integer);
+CREATE SCHEMA testpub_rf_schema2;
+CREATE TABLE testpub_rf_schema2.testpub_rf_tbl6 (i integer);
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+-- Firstly, test using the option publish='insert' because the row filter
+-- validation of referenced columns is less strict than for delete/update.
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1, testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub5
ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND e < 2000);
@@ -162,53 +165,178 @@ RESET client_min_messages;
DROP PUBLICATION testpub_dplus_rf_yes, testpub_dplus_rf_no;
-- some more syntax tests to exercise other parser pathways
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999);
+CREATE PUBLICATION testpub_syntax1 FOR TABLE testpub_rf_tbl1, ONLY testpub_rf_tbl3 WHERE (e < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax1
DROP PUBLICATION testpub_syntax1;
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1, testpub_rf_schema1.testpub_rf_tbl5 WHERE (h < 999) WITH (publish = 'insert');
RESET client_min_messages;
\dRp+ testpub_syntax2
DROP PUBLICATION testpub_syntax2;
--- fail - schemas are not allowed WHERE row-filter
+-- fail - schemas don't allow WHERE clause
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema WHERE (a = 123);
-CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_myschema, testpub_rf_myschema WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1 WHERE (a = 123);
+CREATE PUBLICATION testpub_syntax3 FOR ALL TABLES IN SCHEMA testpub_rf_schema1, testpub_rf_schema1 WHERE (a = 123);
RESET client_min_messages;
--- fail - duplicate tables are not allowed if that table has any WHERE row-filters
+-- fail - duplicate tables are not allowed if that table has any WHERE clause
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1 WHERE (a = 1), testpub_rf_tbl1 WITH (publish = 'insert');
CREATE PUBLICATION testpub_dups FOR TABLE testpub_rf_tbl1, testpub_rf_tbl1 WHERE (a = 2) WITH (publish = 'insert');
RESET client_min_messages;
-- fail - aggregate functions not allowed in WHERE clause
ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e < AVG(e));
--- fail - functions disallowed
-ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
--- fail - user-defined operators disallowed
-CREATE FUNCTION testpub_rf_func(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
-CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+-- fail - user-defined operators are not allowed
+CREATE FUNCTION testpub_rf_func1(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func1, LEFTARG = integer, RIGHTARG = integer);
CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - user-defined functions are not allowed
+CREATE FUNCTION testpub_rf_func2() RETURNS integer AS $$ BEGIN RETURN 123; END; $$ LANGUAGE plpgsql;
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a >= testpub_rf_func2());
+-- fail - non-immutable functions are not allowed. random() is volatile.
+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl1 WHERE (a < random());
+-- ok - NULLIF is allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1,2) = a);
+-- ok - builtin operators are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IS NULL);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE ((a > 5) IS FALSE);
+-- ok - immutable builtin functions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl4 WHERE (length(g) < 6);
+-- fail - user-defined types disallowed
+CREATE TYPE rf_bug_status AS ENUM ('new', 'open', 'closed');
+CREATE TABLE rf_bug (id serial, description text, status rf_bug_status);
+CREATE PUBLICATION testpub6 FOR TABLE rf_bug WHERE (status = 'open') WITH (publish = 'insert');
+DROP TABLE rf_bug;
+DROP TYPE rf_bug_status;
+-- fail - row-filter expression is not simple
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl1 WHERE (a IN (SELECT generate_series(1,5)));
+-- fail - system columns are not allowed
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl1 WHERE ('(0,1)'::tid = ctid);
+-- ok - conditional expressions are allowed
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (a IS DOCUMENT);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl5 WHERE (xmlexists('//foo[text() = ''bar'']' PASSING BY VALUE a));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (NULLIF(1, 2) = a);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (CASE a WHEN 5 THEN true ELSE false END);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (COALESCE(b, 'foo') = 'foo');
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (GREATEST(a, 10) > 10);
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (a IN (2, 4, 6));
+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl1 WHERE (ARRAY[a] <@ ARRAY[2, 4, 6]);
-- fail - WHERE not allowed in DROP
-ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl1 WHERE (e < 27);
-- fail - cannot ALTER SET table which is a member of a pre-existing schema
SET client_min_messages = 'ERROR';
-CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
-ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tbl6 WHERE (i < 99);
+CREATE PUBLICATION testpub6 FOR ALL TABLES IN SCHEMA testpub_rf_schema2;
+ALTER PUBLICATION testpub6 SET ALL TABLES IN SCHEMA testpub_rf_schema2, TABLE testpub_rf_schema2.testpub_rf_tbl6 WHERE (i < 99);
RESET client_min_messages;
DROP TABLE testpub_rf_tbl1;
DROP TABLE testpub_rf_tbl2;
DROP TABLE testpub_rf_tbl3;
DROP TABLE testpub_rf_tbl4;
-DROP TABLE testpub_rf_myschema.testpub_rf_tbl5;
-DROP TABLE testpub_rf_myschema1.testpub_rf_tbl6;
-DROP SCHEMA testpub_rf_myschema;
-DROP SCHEMA testpub_rf_myschema1;
+DROP TABLE testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema1.testpub_rf_tbl5;
+DROP TABLE testpub_rf_schema2.testpub_rf_tbl6;
+DROP SCHEMA testpub_rf_schema1;
+DROP SCHEMA testpub_rf_schema2;
DROP PUBLICATION testpub5;
-DROP PUBLICATION testpub7;
+DROP PUBLICATION testpub6;
DROP OPERATOR =#>(integer, integer);
-DROP FUNCTION testpub_rf_func(integer, integer);
+DROP FUNCTION testpub_rf_func1(integer, integer);
+DROP FUNCTION testpub_rf_func2();
+
+-- ======================================================
+-- More row filter tests for validating column references
+CREATE TABLE rf_tbl_abcd_nopk(a int, b int, c int, d int);
+CREATE TABLE rf_tbl_abcd_pk(a int, b int, c int, d int, PRIMARY KEY(a,b));
+CREATE TABLE rf_tbl_abcd_part_pk (a int PRIMARY KEY, b int) PARTITION by RANGE (a);
+CREATE TABLE rf_tbl_abcd_part_pk_1 (b int, a int PRIMARY KEY);
+ALTER TABLE rf_tbl_abcd_part_pk ATTACH PARTITION rf_tbl_abcd_part_pk_1 FOR VALUES FROM (1) TO (10);
+
+-- Case 1. REPLICA IDENTITY DEFAULT (means use primary key or nothing)
+-- 1a. REPLICA IDENTITY is DEFAULT and table has a PK.
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub6 FOR TABLE rf_tbl_abcd_pk WHERE (a > 99);
+RESET client_min_messages;
+-- ok - "a" is a PK col
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (b > 99);
+-- ok - "b" is a PK col
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "c" is not part of the PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (d > 99);
+-- fail - "d" is not part of the PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+-- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not part of REPLICA IDENTITY
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+
+-- Case 2. REPLICA IDENTITY FULL
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- ok - "c" is in REPLICA IDENTITY now even though not in PK
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- ok - "a" is in REPLICA IDENTITY now
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+
+-- Case 3. REPLICA IDENTITY NOTHING
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY NOTHING
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+
+-- Case 4. REPLICA IDENTITY INDEX
+ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
+ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY USING INDEX idx_abcd_pk_c;
+ALTER TABLE rf_tbl_abcd_nopk ALTER COLUMN c SET NOT NULL;
+CREATE UNIQUE INDEX idx_abcd_nopk_c ON rf_tbl_abcd_nopk(c);
+ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY USING INDEX idx_abcd_nopk_c;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (a > 99);
+-- fail - "a" is in PK but it is not part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
+-- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
+-- fail - "a" is not in REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (c > 99);
+-- ok - "c" is part of REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_nopk SET a = 1;
+
+-- Tests for partitioned table
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (a > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- ok - "a" is a OK col
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (b > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- fail - "b" is not in REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+
+DROP PUBLICATION testpub6;
+DROP TABLE rf_tbl_abcd_pk;
+DROP TABLE rf_tbl_abcd_nopk;
+DROP TABLE rf_tbl_abcd_part_pk;
+-- ======================================================
-- Test cache invalidation FOR ALL TABLES publication
SET client_min_messages = 'ERROR';
diff --git a/src/test/subscription/t/027_row_filter.pl b/src/test/subscription/t/027_row_filter.pl
index 64e71d0..de6b73d 100644
--- a/src/test/subscription/t/027_row_filter.pl
+++ b/src/test/subscription/t/027_row_filter.pl
@@ -19,6 +19,8 @@ $node_subscriber->start;
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_1 (a int primary key, b text)");
$node_publisher->safe_psql('postgres',
+ "ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;");
+$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_2 (c int primary key)");
$node_publisher->safe_psql('postgres',
"CREATE TABLE tab_rowfilter_3 (a int primary key, b boolean)");
@@ -280,9 +282,7 @@ is($result, qq(13|0|12), 'check replicated rows to tab_rowfilter_4');
# - INSERT (1700, 'test 1700') YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
# - UPDATE (1600, NULL) NO, row filter evaluates to false because NULL is not <> 'filtered'
# - UPDATE (1601, 'test 1601 updated') YES, because 1601 > 1000 and 'test 1601 updated' <> 'filtered'
-# - DELETE (1700) NO, row filter contains column b that is not part of
-# the PK or REPLICA IDENTITY and old tuple contains b = NULL, hence, row filter
-# evaluates to false
+# - DELETE (1700) YES, because 1700 > 1000 and 'test 1700' <> 'filtered'
#
$result =
$node_subscriber->safe_psql('postgres',
@@ -291,7 +291,6 @@ is($result, qq(1001|test 1001
1002|test 1002
1600|test 1600
1601|test 1601 updated
-1700|test 1700
1980|not filtered), 'check replicated rows to table tab_rowfilter_1');
# Publish using root partitioned table
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 0c61ccb..89f3917 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3503,6 +3503,7 @@ replace_rte_variables_context
ret_type
rewind_source
rewrite_event
+rf_context
rijndael_ctx
rm_detail_t
role_auth_extra
--
2.7.2.windows.1
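The regression tests in this patch all exercise one rule: for UPDATE or DELETE to be published, every column referenced by the row filter must be part of the table's replica identity. A minimal sketch of that rule (hypothetical names, not PostgreSQL source) in Python:

```python
# Hypothetical sketch, not PostgreSQL source: models the validation rule the
# tests above exercise -- UPDATE/DELETE on a published table fails if the
# row filter references a column outside the replica identity.
def update_allowed(filter_columns, replica_identity_columns):
    """Return (ok, bad_column); ok is False when some filter column is
    missing from the replica identity."""
    for col in sorted(filter_columns):
        if col not in replica_identity_columns:
            return False, col
    return True, None

# REPLICA IDENTITY DEFAULT with PK (a, b): filter on "a" is fine, "c" is not.
assert update_allowed({"a"}, {"a", "b"}) == (True, None)
assert update_allowed({"c"}, {"a", "b"}) == (False, "c")
# REPLICA IDENTITY FULL: every column is part of the identity.
assert update_allowed({"c"}, {"a", "b", "c", "d"}) == (True, None)
# REPLICA IDENTITY NOTHING: no column qualifies.
assert update_allowed({"a"}, set()) == (False, "a")
```

This mirrors the four test cases above (DEFAULT with/without PK, FULL, NOTHING, and USING INDEX, where the identity set is the index's columns).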
On Wednesday, December 8, 2021 2:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Dec 6, 2021 at 6:04 PM Euler Taveira <euler@eulerto.com> wrote:
On Mon, Dec 6, 2021, at 3:35 AM, Dilip Kumar wrote:
On Mon, Dec 6, 2021 at 6:49 AM Euler Taveira <euler@eulerto.com> wrote:
On Fri, Dec 3, 2021, at 8:12 PM, Euler Taveira wrote:
PS> I will update the commit message in the next version. I barely changed the
documentation to reflect the current behavior. I probably missed some changes
but I will fix in the next version.
I realized that I forgot to mention a few things about the UPDATE behavior.
Regardless of 0003, we need to define which tuple will be used to evaluate the
row filter for UPDATEs. We already discussed it circa [1]. This current version
chooses *new* tuple. Is it the best choice?
But with 0003, we are using both the tuples for evaluating the row
filter, so instead of fixing 0001, why don't we just merge 0003 with
0001? I mean eventually, 0003 is doing what is the agreed behavior,
i.e. if just OLD is matching the filter then convert the UPDATE to
DELETE; OTOH, if only NEW is matching the filter then convert the UPDATE
to INSERT. Do you think that even we merge 0001 and 0003 then also
there is an open issue regarding which row to select for the filter?
Maybe I was not clear. IIUC we are still discussing 0003 and I would like to
propose a different default based on the conclusion I came up. If we merged
0003, that's fine; this change will be useless. If we don't or it is optional,
it still has its merit.
Do we want to pay the overhead of evaluating both tuples for UPDATEs? I'm still
processing if it is worth it. If you think that in general the row filter
contains the primary key and it is rare to change it, it will waste cycles
evaluating the same expression twice. It seems this behavior could be
controlled by a parameter.
I think the first thing we should do in this regard is to evaluate the
performance for both cases (when we apply a filter to both tuples vs.
to one of the tuples). In case the performance difference is
unacceptable, I think it would be better to still compare both tuples
as default to avoid data inconsistency issues and have an option to
allow comparing one of the tuples.
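The UPDATE behavior under discussion (evaluate the row filter against both the OLD and NEW tuples and transform the operation accordingly) can be sketched as follows; this is a hypothetical model of the agreed semantics, not the patch code:

```python
# Hypothetical sketch of the agreed UPDATE transformation: run the row
# filter on both tuples and decide what the subscriber should receive.
def transform_update(filter_fn, old_row, new_row):
    old_match = filter_fn(old_row)
    new_match = filter_fn(new_row)
    if old_match and new_match:
        return "UPDATE"   # row stays inside the published set
    if old_match and not new_match:
        return "DELETE"   # row leaves the published set
    if not old_match and new_match:
        return "INSERT"   # row enters the published set
    return None           # filtered out entirely

row_filter = lambda row: row["a"] > 1
assert transform_update(row_filter, {"a": 2}, {"a": 3}) == "UPDATE"
assert transform_update(row_filter, {"a": 2}, {"a": 0}) == "DELETE"
assert transform_update(row_filter, {"a": 0}, {"a": 2}) == "INSERT"
assert transform_update(row_filter, {"a": 0}, {"a": 1}) is None
```

The overhead question above amounts to whether `filter_fn` should be run twice per UPDATE, which is what the performance tests below measure.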
I did some performance tests to see if 0003 patch has much overhead.
I compared applying the first two patches with applying the first three patches, in four cases:
1) only old rows match the filter.
2) only new rows match the filter.
3) both old rows and new rows match the filter.
4) neither old rows nor new rows match the filter.
0003 patch checks both old rows and new rows, and without 0003 patch, it only
checks either old or new rows. We want to know whether it would take more time
if we check the old rows.
I ran the tests in asynchronous mode and compared the SQL execution time. I also
tried some complex filters, to see if the difference could be more obvious.
The result and the script are attached.
I didn't see a big difference between the results with and without the 0003 patch
in any of the cases. So I think the 0003 patch doesn't have much overhead.
Regards,
Tang
On Mon, Dec 20, 2021 at 6:07 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Sat, Dec 18, 2021 at 1:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I think it's a concern, for such a basic example with only one row,
getting unpredictable (and even wrong) replication results, depending
upon the order of operations.
I am not sure how we can deduce that. The results are based on current
and new values of row which is what I think we are expecting here.In the two simple cases presented, the publisher ends up with the same
single row (2,1) in both cases, but in one of the cases the subscriber
ends up with an extra row (1,1) that the publisher doesn't have. So,
in using a "filter", a new row has been published that the publisher
doesn't have. I'm not so sure a user would be expecting that. Not to
mention that if (1,1) is subsequently INSERTed on the publisher side,
it will result in a duplicate key error on the publisher.
Personally, I feel users need to be careful in defining publications
and subscriptions, otherwise, there are various ways "duplicate key
error .." kind of issues can arise. Say, you different publications
which publish the same table, and then you have different
subscriptions on the subscriber which subscribe to those publications.
Doesn't this problem result from allowing different WHERE clauses for
different pubactions for the same table?
My current thoughts are that this shouldn't be allowed, and also WHERE
clauses for INSERTs should, like UPDATE and DELETE, be restricted to
using only columns covered by the replica identity or primary key.
Hmm, even if we do that, one could have removed the insert row filter
by the time we are evaluating the update. So, we will get the same
result. I think the behavior in your example is as we expect as per
the specs defined by the patch and I don't see any problem, in this
case, w.r.t. replication results. Let us see what others think on this?
Here I'm talking about the typical use-case of setting the
row-filtering WHERE clause up-front and not changing it thereafter.
I think that dynamically changing filters after INSERT/UPDATE/DELETE
operations is not the typical use-case, and IMHO it's another thing
entirely (could result in all kinds of unpredictable, random results).
Yeah, that's what I also wanted to say, but users need to
carefully define publications/subscriptions; otherwise, an up-front
definition can also lead to unpredictable results, as shared in the
explanation above. I feel Hou-San's latest email [1] explains the
current rules very well and maybe we should document them in some way
to avoid confusion.
Personally I think it would make more sense to:
1) Disallow different WHERE clauses on the same table, for different pubactions.
2) If only INSERTs are being published, allow any column in the WHERE
clause, otherwise (as for UPDATE and DELETE) restrict the referenced
columns to be part of the replica identity or primary key.
We can restrict in some way like you are saying or we can even
restrict such that we "disallow specifying row filters unless
pubactions have all the dml operations and allow row filter to have
columns that are part of replica identity or primary key". I feel it
is better to provide flexibility as the current patch does and
document it to make users aware of the kind of problems that can arise
with the wrong usage.
[1]: /messages/by-id/OS0PR01MB57168F4384D50656A4FC2DC5947B9@OS0PR01MB5716.jpnprd01.prod.outlook.com
--
With Regards,
Amit Kapila.
On Mon, Dec 20, 2021 at 12:51 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
I think it might not be hard to predict the current behavior. User only need to be
aware of that:
1) pubaction and row filter on different publications are combined with 'OR'.
2) For UPDATE, we execute the filter for both the OLD and NEW tuples and would change
the operation type accordingly.
For the example mentioned:
create table tbl1 (a int primary key, b int);
create publication A for table tbl1 where (b<2) with(publish='insert');
create publication B for table tbl1 where (a>1) with(publish='update');
If we follow rules 1) and 2), I feel we are able to predict the following
conditions:
--
WHERE (action = 'insert' AND b < 2) OR (action = 'update' AND a > 1)
--
So, it seems acceptable to me.
Personally, I think the current design could give users more flexibility to
handle some complex scenarios. If users want a simple setting for a publication,
they can also set the same row filter for the same table in different publications.
To avoid confusion, I think we can document these rules clearly.
BTW, from the IBM documentation, I think IBM also supports this kind of complex
condition [1].
[1] https://www.ibm.com/docs/en/idr/11.4.0?topic=rows-log-record-variables
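Rule 1) above, combining the pubaction and row filter of each publication with OR, can be sketched for the `tbl1` example as follows (a hypothetical model of the described semantics, not PostgreSQL source):

```python
# Hypothetical sketch of rule 1): each publication contributes
# (its pubactions AND its row filter), and contributions combine with OR.
publications = [
    {"publish": {"insert"}, "filter": lambda row: row["b"] < 2},  # publication A
    {"publish": {"update"}, "filter": lambda row: row["a"] > 1},  # publication B
]

def should_publish(action, row):
    return any(action in pub["publish"] and pub["filter"](row)
               for pub in publications)

# Equivalent to: (action = 'insert' AND b < 2) OR (action = 'update' AND a > 1)
assert should_publish("insert", {"a": 0, "b": 1}) is True
assert should_publish("insert", {"a": 0, "b": 5}) is False
assert should_publish("update", {"a": 2, "b": 5}) is True
assert should_publish("delete", {"a": 2, "b": 1}) is False
```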
Yes, I agree with this. It's better to give users more flexibility
while warning them about what the consequences are rather than restricting
him with constraints.
We could explain this in the documentation so that users can better
predict the effect of having pubaction specific filters.
regards,
Ajin Cherian
Fujitsu Australia
On Thu, Dec 2, 2021 at 7:40 PM vignesh C <vignesh21@gmail.com> wrote:
...
Thanks for the updated patch, few comments:
1) Both testpub5a and testpub5c publications are the same, one of them can be removed:
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1) WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3) WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;
testpub5b will be covered in the earlier existing case above:
ALTER PUBLICATION testpib_ins_trunct ADD TABLE pub_test.testpub_nopk,
testpub_tbl1;
\d+ pub_test.testpub_nopk
\d+ testpub_tbl1
I felt the test related to testpub5b is also not required
Skipped. Strictly speaking you may be correct to say this code path is
already tested elsewhere. But this test case was meant for \d+ so I
wanted it to be "self-contained" and easy to observe it displaying
both with and without a filters both at the same time.
3) testpub7 can be renamed to testpub6 to maintain the continuity since the previous testpub6 did not succeed:
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+RESET client_min_messages;
Fixed in v48 [1]
4) Did this test intend to include a WHERE clause for testpub_rf_tb16? If so, it can be added:
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+RESET client_min_messages;
Fixed in v48 [1]
------
[1]: /messages/by-id/CAHut+PuHz1oFM7oaiHeqxMQqd0L70bV_hT7u_mDf3b8As5kwig@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Wed, Nov 24, 2021 at 3:22 PM vignesh C <vignesh21@gmail.com> wrote:
On Tue, Nov 23, 2021 at 4:58 PM Ajin Cherian <itsajin@gmail.com> wrote:
3) Should we include row filter condition in pg_publication_tables
view like in describe publication(\dRp+) , since the prqual is not
easily readable in pg_publication_rel table:
How about exposing pubdef (or publicationdef) column via
pg_publication_tables? In this, we will display the publication
definition. This is similar to what we do for indexes via pg_indexes
view:
postgres=# select * from pg_indexes where tablename like '%t1%';
schemaname | tablename | indexname | tablespace | indexdef
------------+-----------+-----------+------------+-------------------------------------------------------------------
public | t1 | idx_t1 | | CREATE INDEX idx_t1 ON public.t1 USING btree
(c1) WHERE (c1 < 10)
(1 row)
The one advantage I see with this is that we will avoid adding
additional columns for the other patches like "column filter". Also,
it might be convenient for users. What do you think?
--
With Regards,
Amit Kapila.
On Mon, Dec 20, 2021 at 4:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Nov 24, 2021 at 3:22 PM vignesh C <vignesh21@gmail.com> wrote:
On Tue, Nov 23, 2021 at 4:58 PM Ajin Cherian <itsajin@gmail.com> wrote:
3) Should we include row filter condition in pg_publication_tables
view like in describe publication(\dRp+) , since the prqual is not
easily readable in pg_publication_rel table:
How about exposing pubdef (or publicationdef) column via
pg_publication_tables? In this, we will display the publication
definition. This is similar to what we do for indexes via pg_indexes
view:
postgres=# select * from pg_indexes where tablename like '%t1%';
schemaname | tablename | indexname | tablespace | indexdef
------------+-----------+-----------+------------+-------------------------------------------------------------------
public | t1 | idx_t1 | | CREATE INDEX idx_t1 ON public.t1 USING btree
(c1) WHERE (c1 < 10)
(1 row)The one advantage I see with this is that we will avoid adding
additional columns for the other patches like "column filter". Also,
it might be convenient for users. What do you think?
I think it is a good idea, particularly since there are already some precedents.
OTOH maybe there is no immediate requirement for this feature because
there are already alternative ways to conveniently display the filters
(e.g. psql \d+ and \dRp+).
Currently, there is no pg_get_pubdef function (analogous to the
index's pg_get_indexdef) so that would need to be written from
scratch.
So I feel this is a good feature, but it could be implemented as an
independent patch in another thread.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Monday, December 20, 2021 11:24 AM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com>
On Wednesday, December 8, 2021 2:29 PM Amit Kapila
<amit.kapila16@gmail.com> wrote:On Mon, Dec 6, 2021 at 6:04 PM Euler Taveira <euler@eulerto.com> wrote:
On Mon, Dec 6, 2021, at 3:35 AM, Dilip Kumar wrote:
On Mon, Dec 6, 2021 at 6:49 AM Euler Taveira <euler@eulerto.com> wrote:
On Fri, Dec 3, 2021, at 8:12 PM, Euler Taveira wrote:
PS> I will update the commit message in the next version. I barely changed
the
documentation to reflect the current behavior. I probably missed some
changes
but I will fix in the next version.
I realized that I forgot to mention a few things about the UPDATE behavior.
Regardless of 0003, we need to define which tuple will be used to evaluate the
row filter for UPDATEs. We already discussed it circa [1]. This current version
chooses the *new* tuple. Is it the best choice?

But with 0003, we are using both tuples for evaluating the row
filter, so instead of fixing 0001, why don't we just merge 0003 with
0001? I mean, eventually, 0003 is doing what is the agreed behavior,
i.e. if just the OLD tuple matches the filter then convert the UPDATE to
a DELETE; OTOH, if only the new one matches then convert the UPDATE
to an INSERT. Do you think that even if we merge 0001 and 0003 there is
still an open issue regarding which row to select for the filter?

Maybe I was not clear. IIUC we are still discussing 0003 and I would like to
propose a different default based on the conclusion I came up with. If we merge
0003, that's fine; this change will be useless. If we don't, or it is optional,
it still has its merit.

Do we want to pay the overhead of evaluating both tuples for UPDATEs? I'm still
processing if it is worth it. If you think that in general the row filter
contains the primary key and it is rare to change it, it will waste cycles
evaluating the same expression twice. It seems this behavior could be
controlled by a parameter.

I think the first thing we should do in this regard is to evaluate the
performance for both cases (when we apply a filter to both tuples vs.
to one of the tuples). In case the performance difference is
unacceptable, I think it would be better to still compare both tuples
as default to avoid data inconsistency issues and have an option to
allow comparing one of the tuples.

I did some performance tests to see if the 0003 patch has much overhead.
I compared applying the first two patches with applying the first three patches
in four cases:
1) only old rows match the filter.
2) only new rows match the filter.
3) both old rows and new rows match the filter.
4) neither old rows nor new rows match the filter.

The 0003 patch checks both old rows and new rows; without it, only either the
old or the new rows are checked. We want to know whether it would take more
time if we check the old rows.

I ran the tests in asynchronous mode and compared the SQL execution time. I
also tried some complex filters, to see if the difference could be more
obvious.

The result and the script are attached.
I didn't see a big difference between the results with and without the 0003
patch in any of the cases, so I think the 0003 patch doesn't have much
overhead.
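The UPDATE transformation being exercised by those four cases (convert to
DELETE when only the old tuple matches the filter, to INSERT when only the new
one does) can be summarized as a small decision table. This is an illustrative
sketch, not the actual patch code; the function and parameter names are
invented for clarity:

```python
def transform_update(old_matches: bool, new_matches: bool) -> str:
    """Decide how to publish an UPDATE, given whether the row filter
    matches the old and the new tuple (illustrating the behavior
    discussed for the 0003 patch; not the patch's own code)."""
    if old_matches and new_matches:
        return "UPDATE"  # both tuples pass the filter: publish as UPDATE
    if old_matches:
        return "DELETE"  # row leaves the filtered set: publish as DELETE
    if new_matches:
        return "INSERT"  # row enters the filtered set: publish as INSERT
    return "SKIP"        # neither tuple matches: publish nothing

# The four test cases above map onto the four branches:
assert transform_update(True, True) == "UPDATE"
assert transform_update(True, False) == "DELETE"
assert transform_update(False, True) == "INSERT"
assert transform_update(False, False) == "SKIP"
```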
In the previous test, I ran each case 3 times and took the average value, which
may be affected by performance fluctuations.
So, to make the results more accurate, I ran the tests more times (10 times)
and took the average value. The result is attached.
In general, I can see the time difference is within 3.5%, which is within a
reasonable performance range, I think.
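The comparison method described (average over 10 runs, difference expressed as
a percentage of the baseline) can be sketched as follows. The timings below
are placeholders, not the attached results:

```python
def pct_diff(baseline_runs, patched_runs):
    """Average two sets of timings and return the patched overhead
    relative to the baseline, as a percentage."""
    base = sum(baseline_runs) / len(baseline_runs)
    patched = sum(patched_runs) / len(patched_runs)
    return (patched - base) / base * 100.0

# Placeholder timings in seconds; per the reported results, an
# overhead within ~3.5% was considered acceptable.
overhead = pct_diff([1.00] * 10, [1.02] * 10)
assert abs(overhead - 2.0) < 1e-6
```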
Regards,
Tang
Attachments:
update-performance-test-10-times.pngimage/png; name=update-performance-test-10-times.pngDownload